FOREWORD


The Army Mathematics Steering Committee (AMSC) annually sponsors
the Army Conferences on Applied Mathematics and Computing. As the
title indicates, these meetings deal with the mathematics needed to
understand the world around us. This is in contrast with core
mathematics, which, in the main, does not deal directly with events
and objects of the physical world. Since very few of the papers
presented at the four conferences held to date were in the field
of pure mathematics, these meetings are rightly named. The U.S.
Military Academy served as the host of the fifth meeting in this
series, which was held at West Point, New York, on 15-18 June
1987. Colonel David Cameron served as Chairperson on Local
Arrangements. He was assisted with this task by Majors David
Arney and Scott Huxel. The members of the AMSC would like to
thank these three individuals for all their efforts in
coordinating the many details needed to conduct this successful
scientific meeting.

The program of this year's conference consisted of three parts,
namely: (a) contributed papers by Army, academic and other
scientific personnel; (b) three special sessions; and (c) seven
invited addresses. There were more than fifty contributed papers
presented in the technical sessions. About half of these papers
were contributed by scientists from ten Army installations. These
presentations gave the attendees an opportunity to hear about
scientific research being conducted within these laboratories.

The topics for the special sessions were organized in three
different areas, namely, stochastic analysis, solid modeling and
CAD/CAM, and mathematical aspects of composites. For the invited
speaker phase of the meeting, the Program Committee obtained the
services of the following nationally known scientists to talk on
topics of current interest to Army personnel:


SPEAKERS AND AFFILIATIONS, WITH TITLES OF ADDRESSES

Professor David Mumford, URI Center on Intelligent Control Systems, Harvard University
    Some Mathematical Problems Arising from Computer Vision

Professor Roland Glowinski, University of Houston
    On the Numerical Solution of Time Dependent Problems in High Dimensions

Dr. Sukumar Chakravarthy, Rockwell International Corporation
    Unified Euler and Navier-Stokes Numerical Methods

Professor Robert Taylor, University of California-Berkeley
    Computation Mechanics: Today and Tomorrow

Professor Charles Van Loan, Mathematical Sciences Institute, Cornell University
    Parallel Matrix Computations on Loosely Coupled Systems of Array Processors

Professor Anthony Jameson, Princeton University
    Computational Methods for Transonic Flow

Professor James Glimm, Courant Institute
    The Interaction on Nonlinear Waves


The success of the conference was due to many individuals: the
active and enthusiastic members of the audience, the chairperson,
and the many speakers. The members of the AMSC were pleased
that most of the speakers were able to find time to
prepare papers for the Transactions. These research articles will
enable many persons who were not able to attend the symposium to
profit from these contributions to the scientific literature.
Abstract

We use Lie transforms to approximate the Poincare map of a weakly nonlinear
periodic perturbation of the simple harmonic oscillator in order to study
the stability of the trivial solution. Resonant frequencies, corresponding to
nonremovable terms in the differential equation, are identified through $O(\epsilon^2)$.
We show that detuning from resonance stabilizes the trivial solution when
the perturbation contains no linear periodic terms. Finally, we study a typical
bifurcation between two lowest-order resonant frequencies. A MACSYMA
program which performs the Lie transform algorithm to arbitrary order is
presented in the appendix with a sample run.

1  Introduction 

In this paper we present some results concerning the stability of the trivial
solution of the equation

$$\ddot{x} + \omega^2 x + \epsilon f(t,x) = 0 \qquad (1)$$

where $f(t,x)$ is $T$-periodic in $t$, Taylor-Fourier expandable in $x$ and $t$
respectively, and satisfies $f(t,0) = 0$. The hamiltonian structure of eq.(1)
permits us to use Lie transforms to reduce the nonautonomous hamiltonian
induced by eq.(1) to an autonomous one by means of a periodic canonical
near-identity transformation. The resulting autonomous hamiltonian describes the
Poincare map in a neighborhood of the origin.

*This work was partially supported by NSF grant 85-09-181 and by the Army Research Office
through the Mathematical Sciences Institute, Cornell University, Ithaca, NY 14853.

Analysis  of  the  Poincare  map  gives  substantial  information  concerning  the 
original  equation.  The  presence  of  a  periodic  point  in  the  Poincare  map  implies 
the  existence  of  a  periodic  orbit  in  the  original  equation.  In  particular,  a 
periodic  saddle  point  corresponds  to  a  hyperbolic  periodic  orbit,  and  a  periodic 
center  corresponds  to  an  elliptic  periodic  orbit. 

We begin by describing the Lie transform algorithm as used in this work.
We then present a theorem which defines the $O(\epsilon)$ and $O(\epsilon^2)$ resonances for
the general case of eq.(1), and show that almost all higher order resonances
are stable.

Next, we study the properties of the trivial solution of a simple equation of
the type (1). We identify the $O(\epsilon)$ and $O(\epsilon^2)$ resonances, and characterize the
stability of the trivial solution for all nonzero $\omega$. Results of the Lie transform
analysis are compared with numerically generated Poincare maps.

Finally, we study a bifurcation between $O(\epsilon)$ resonances of cubic and quadratic
nonlinearities. In this example, a $4\pi$-periodic hyperbolic orbit becomes a
$2\pi$-periodic hyperbolic orbit through a sequence of bifurcations.


2  Results 


We consider the general equation

$$\ddot{x} + \omega^2 x + \epsilon \sum_\alpha g_\alpha(t)\, x^{N_\alpha - 1} = 0 \qquad (2)$$


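where the $g_\alpha(t)$ are periodic and the $N_\alpha$ are positive integers. This equation
was studied extensively in [1]. Here we summarize some results and refer the
reader to [1] for additional information.

In canonical variables $q$ and $p$, eq.(2) is generated by the hamiltonian

$$h(q,p,t) = \frac{p^2}{2} + \frac{\omega^2 q^2}{2} + \epsilon \sum_\alpha \frac{1}{N_\alpha}\, g_\alpha(t)\, q^{N_\alpha}. \qquad (3)$$

The change of variables

$$q = \sqrt{2J/\omega}\,\sin(\theta + \omega t), \qquad p = \sqrt{2J\omega}\,\cos(\theta + \omega t) \qquad (4)$$

reduces eq.(3) to the $O(\epsilon)$ hamiltonian

$$H(J,\theta,t) = \epsilon \sum_\alpha \frac{1}{N_\alpha}\, g_\alpha(t) \left(\frac{2J}{\omega}\right)^{N_\alpha/2} \sin^{N_\alpha}(\theta + \omega t) = \epsilon H_1(J,\theta,t). \qquad (5)$$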
We will apply the Lie transform procedure to this $O(\epsilon)$ hamiltonian.

Definition 1 Let

$$w(t,x,\epsilon) = w_1(t,x) + \epsilon\, w_2(t,x) + \epsilon^2 w_3(t,x) + \cdots$$

be the Lie generating function defining a canonical transformation which reduces
the hamiltonian (5) to an autonomous one. Then $\omega$ is a resonance at
$O(\epsilon^n)$ if it is a pole of $w_n(t,x)$ but not of $w_k(t,x)$, for $1 \le k \le n-1$.

We denote by $\Omega_n$ the set of frequencies which are resonant at $O(\epsilon^n)$.

An  equivalent  definition  of  a  resonant  frequency  may  be  formulated  in  terms 
of  the  near-identity  transformation  generated  by  periodic  averaging. 

For example, the $O(\epsilon^n)$ resonance for the linear Mathieu equation

$$\ddot{x} + \omega^2 x + \epsilon x \cos t = 0$$

is $\omega = n/2$, for $n \ge 1$. Resonant frequencies correspond to non-removable
terms in the hamiltonian (with respect to Lie transforms) or in the differential
equation (with respect to periodic averaging).

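In order to show how to generate all $O(\epsilon)$ and $O(\epsilon^2)$ resonances for eq.(2),
we introduce some notation. By assumption, each $g_\alpha(t)$ is periodic and may
therefore be expanded in a Fourier series. Let

$$c_\mu^{(\alpha)} = \frac{1}{2\pi} \int_0^{2\pi} g_\alpha(t)\, e^{-i\mu t}\, dt$$

be the Fourier coefficients of $g_\alpha(t)$ for integer $\mu$. Let $M_\alpha$ denote the set of
frequencies of $g_\alpha(t)$, that is,

$$M_\alpha = \{\mu : c_\mu^{(\alpha)} \neq 0,\ \mu \in \mathbb{Z}\}.$$

Then

$$g_\alpha(t) = \sum_{\mu \in M_\alpha} c_\mu^{(\alpha)} e^{i\mu t}.$$

As shown in [1], $\Omega_1$ consists of all $\omega$ satisfying

$$\omega = \frac{\mu}{N_\alpha - 2\nu}, \qquad \mu \in M_\alpha, \quad 0 \le \nu \le N_\alpha,$$

and $\Omega_2$ consists of all $\omega$ satisfying

$$\omega = \frac{\mu + \gamma}{N_\alpha + N_\beta - 2(\nu + \delta)}, \qquad \mu \in M_\alpha, \quad \gamma \in M_\beta, \quad 0 \le \nu \le N_\alpha, \quad 0 \le \delta \le N_\beta, \quad \delta N_\alpha \neq \nu N_\beta,$$

which are not also included in $\Omega_1$.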

It is also shown in [1] that if $\omega \notin \Omega_1 \cup \Omega_2 \cup \{0\}$, then the resulting reduced
hamiltonian is of the form

$$K = \epsilon f_1(J) + \epsilon^2 f_2(J) + \cdots$$

where $f_1$ and $f_2$ contain integer or half-integer powers of $J$. This implies that
the origin of the Poincare map is a center, and therefore the trivial solution is
a stable elliptic orbit.

We show here that if the perturbation is strictly nonlinear then detuning
creates a hyperbolic periodic orbit which traps the trivial solution, stabilizing
it. We then analyze a bifurcation problem between periodic orbits near
resonances showing how a $4\pi$-periodic orbit bifurcates into a $2\pi$-periodic orbit.


3  Lie  Transforms 


An important characteristic of autonomous hamiltonian systems is that the
hamiltonian is constant along solutions of the system of differential equations.
If the phase space has dimension two then the solutions are level curves of the
hamiltonian. The reader is referred to [3] or [4] for a complete discussion of
hamiltonian mechanics.

In this work we use Lie transforms to reduce eq.(3) to an autonomous
hamiltonian, and then analyze the level curves of this autonomous hamiltonian
to determine the behavior of solutions which have initial conditions close
to the trivial solution. The implementation of the Lie transform algorithm
which is presented here implicitly constructs a canonical change of coordinates
which performs the reduction to an autonomous form. It is obvious that no
autonomous canonical change of variables can make this reduction. Therefore
the hamiltonian with respect to the new coordinates must be determined by
means of a generating function or some equivalent method which takes into
account the nonautonomous nature of the transformation. The Lie transform
method is an efficient perturbation scheme which explicitly generates the functional
form of the reduced hamiltonian under an implicitly defined canonical
periodic near-identity transformation.

Let $x$ and $y$ denote the old and new coordinates, respectively. Let $\epsilon$ denote
the perturbation parameter. Let $H$ denote the hamiltonian with respect to
the $x$ coordinates, and let $K$ denote the transformed hamiltonian. We assume
that $H$ and $K$ may each be written as power series in $\epsilon$:

$$H(t,x,\epsilon) = H_0(t,x) + \epsilon H_1(t,x) + \epsilon^2 H_2(t,x) + \cdots$$

and

$$K(t,x,\epsilon) = K_0(t,x) + \epsilon K_1(t,x) + \epsilon^2 K_2(t,x) + \cdots$$

The relation between $x$ and $y$ is defined implicitly in terms of a Lie generating
function $w(t,x)$ as

$$\frac{dx_i}{d\epsilon} = \{x_i, w\} \qquad (6)$$

where $\{\ ,\ \}$ is the Poisson bracket operator. For two-dimensional phase space,
the Poisson bracket operator is

$$\{f, g\} = \frac{\partial f}{\partial x_1}\frac{\partial g}{\partial x_2} - \frac{\partial f}{\partial x_2}\frac{\partial g}{\partial x_1}.$$

In words, the new coordinate system evolves from the old one by means of a
"hamiltonian flow" in the evolution quantity $\epsilon$. See [3] for a complete discussion
of this procedure. It is straightforward to show that the change of variables
$x \to y$ defined by eq.(6) is canonical. This consists of showing that the fundamental
Lagrange brackets are preserved under the transformation.

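The reduced hamiltonian $K$ is related to $H$ by

$$K_0 = H_0,$$
$$K_1 = H_1 + \{w_1, H_0\} + \frac{\partial w_1}{\partial t}, \qquad (7)$$
$$K_2 = H_2 + \frac{1}{2}\{w_1, K_1 + H_1\} + \frac{1}{2}\left(\frac{\partial w_2}{\partial t} + \{w_2, H_0\}\right).$$

Although this sequence can be written in closed form to arbitrary order, we
need it only through $O(\epsilon^2)$. See [2] for full details of this topic.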

It  is  important  to  interpret  eq.(7)  correctly.  The  right-hand  side  of  each 
equation  is  a  function  of  x  and  t,  and  the  Poisson  brackets  are  computed  with 
respect  to  the  x  coordinate  system.  The  resulting  function  K  is  evaluated  at 
K(t,x).  The  x  are  dummy  variables,  and  may  be  replaced  by  y  to  give  the 
transformed  hamiltonian. 

The sequence (7) gives the transformed hamiltonian for an arbitrary generating
function $w$. The trick is to choose successive $w_i$ to make the corresponding
$K_i$ as simple as possible. This means choosing $w_i$ at the $i$th step
such that

$$\frac{\partial w_i}{\partial t} + \{w_i, H_0\}$$

removes as many terms as possible in the right-hand side of the $i$th equation
in (7). While this operator is linear, it has a nontrivial kernel; therefore
some terms may not be removable. In the context of periodic perturbations,
this means that $w_i$ cannot be chosen to make $K_i$ autonomous directly. However,
after all nonessential terms have been removed to desired order using Lie
transforms, a final canonical transformation of the form

$$J \to I, \qquad \theta \to \phi + at$$

for some scalar $a$ can always be found which makes it autonomous.

The method may be simplified considerably by the following trick: Apply
a canonical transformation which removes the $O(1)$ terms of the hamiltonian
so that $H_0 \equiv 0$. Then all Poisson brackets in (7) involving $H_0$ vanish, and the
terms which are removable are precisely the $t$-dependent ones. The appropriate
choice of $w_i$ is to take $-w_i$ as the $t$-antiderivative of the $t$-dependent terms.
The resulting $K$ is autonomous by construction. This modified Lie transform
algorithm has been implemented in MACSYMA since the amount of algebra
required to carry the perturbation scheme through even $O(\epsilon^2)$ is too daunting
to compute by hand with any confidence. The program and sample runs are
given in the appendix. For a further discussion on the use of computer algebra
in perturbation schemes, see [1], [6], and [7].

While the simplification of the algorithm is important from the computer
algebra point of view, it is perhaps more important for analytical purposes.
This modified method was used to determine the $O(\epsilon)$ and $O(\epsilon^2)$ resonances
of the general equation given previously.

In principle, this strategy may be used in any system where the $\epsilon = 0$
problem may be solved exactly. For example, a system of linear oscillators
with weak nonlinear coupling may be studied using this simplification.

4  Determining  the  Resonant  Frequencies 

In  this  section  we  briefly  describe  the  procedure  by  which  resonances  may  be 
found  using  Lie  transforms.  For  complete  details,  see  [1]. 

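We assume that the equation is of the form

$$\ddot{x} + \omega^2 x + \epsilon \sum_\alpha g_\alpha(t)\, x^{N_\alpha - 1} = 0.$$

In canonical variables $q$ and $p$, this equation gives rise to the hamiltonian

$$h(q,p,t) = \frac{p^2}{2} + \frac{\omega^2 q^2}{2} + \epsilon \sum_\alpha \frac{1}{N_\alpha}\, g_\alpha(t)\, q^{N_\alpha}. \qquad (8)$$

Our first step, as described at the end of the previous section, will be to perform
a transformation of coordinates to a system in which the hamiltonian contains
no $O(1)$ terms. The change of variables

$$q = \sqrt{2J/\omega}\,\sin(\theta + \omega t), \qquad p = \sqrt{2J\omega}\,\cos(\theta + \omega t) \qquad (9)$$

reduces eq.(8) to the simplified hamiltonian

$$H(J,\theta,t) = \epsilon \sum_\alpha \frac{1}{N_\alpha}\, g_\alpha(t) \left(\frac{2J}{\omega}\right)^{N_\alpha/2} \sin^{N_\alpha}(\theta + \omega t) = \epsilon H_1(J,\theta,t). \qquad (10)$$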


We now apply the Lie transform procedure to transform eq.(10) into an
autonomous hamiltonian. Note that no $O(1)$ terms are present. As noted at
the end of the previous section, this simplification permits us to compute the
Lie generating function at each step by integration of exponentials.

To identify the resonances, we first compute $w_1$ for arbitrary $\omega$. This gives
a function similar in form to $H_1$ but whose coefficients are rational functions
of $\omega$. The poles of these coefficients, which correspond to non-removable terms
in $H_1(J,\theta,t)$, are frequencies which are resonant at $O(\epsilon)$. Having identified the
$O(\epsilon)$ resonances, we may then implicitly compute $w_2$ to identify possible $O(\epsilon^2)$
resonances.

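We first introduce some notation. By assumption, each $g_\alpha(t)$ is periodic
and may therefore be expanded in a Fourier series. Let

$$c_\mu^{(\alpha)} = \frac{1}{2\pi} \int_0^{2\pi} g_\alpha(t)\, e^{-i\mu t}\, dt$$

be the Fourier coefficients of $g_\alpha(t)$ for integer $\mu$. Let $M_\alpha$ denote the set of
frequencies of $g_\alpha(t)$, that is,

$$M_\alpha = \{\mu : c_\mu^{(\alpha)} \neq 0,\ \mu \in \mathbb{Z}\}.$$

Then

$$g_\alpha(t) = \sum_{\mu \in M_\alpha} c_\mu^{(\alpha)} e^{i\mu t}.$$

Expanding the trigonometric functions with the binomial theorem and inserting
the expansion for $g_\alpha(t)$ in eq.(10) gives

$$H_1(J,\theta,t) = \sum_\alpha \sum_{\mu \in M_\alpha} \sum_{\nu=0}^{N_\alpha} a_{\mu\nu}^{(\alpha)}\, e^{i\theta(2\nu - N_\alpha)}\, e^{it((2\nu - N_\alpha)\omega + \mu)} \qquad (11)$$

where

$$a_{\mu\nu}^{(\alpha)} = \frac{c_\mu^{(\alpha)}}{N_\alpha} \left(\frac{2J}{\omega}\right)^{N_\alpha/2} \frac{(-1)^{N_\alpha - \nu}}{(2i)^{N_\alpha}} \binom{N_\alpha}{\nu}. \qquad (12)$$

Proceeding formally, $w_1$ is just the negative of the $t$-antiderivative of $H_1$:

$$w_1 = -\sum_\alpha \sum_{\mu \in M_\alpha} \sum_{\nu=0}^{N_\alpha} \frac{a_{\mu\nu}^{(\alpha)}\, e^{i\theta(2\nu - N_\alpha)}\, e^{it((2\nu - N_\alpha)\omega + \mu)}}{i((2\nu - N_\alpha)\omega + \mu)}.$$

This choice of $w_1$ makes $K_1$ the $t$-independent part of $H_1$.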
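The splitting of $H_1$ into a $t$-independent part (kept in $K_1$) and a $t$-dependent part (integrated into $w_1$) can be illustrated numerically. The following Python sketch is not the authors' MACSYMA program. It handles a single perturbation term $g(t)x^{N-1}$, stores each Fourier term as a coefficient keyed by the pair $(k, \Omega)$ with $k = 2\nu - N$ and $\Omega = k\omega + \mu$, and assumes the coefficient $\frac{c_\mu}{N}(2J/\omega)^{N/2}(-1)^{N-\nu}(2i)^{-N}\binom{N}{\nu}$ obtained by expanding $\sin^N$ with the binomial theorem. The function names `h1_terms` and `split_terms` are hypothetical.

```python
from fractions import Fraction
from math import comb


def h1_terms(N, fourier, omega, J):
    """Fourier terms of H_1 for one perturbation term g(t) * x**(N-1).

    fourier maps each integer frequency mu of g to its coefficient c_mu.
    The result maps (k, Omega) to a, meaning a*exp(i*k*theta)*exp(i*Omega*t),
    with k = 2*nu - N and Omega = k*omega + mu (exact, since omega is a Fraction).
    """
    scale = (2 * J / float(omega)) ** (N / 2) / N / (2j) ** N
    terms = {}
    for mu, c in fourier.items():
        for nu in range(N + 1):
            k = 2 * nu - N
            a = c * scale * (-1) ** (N - nu) * comb(N, nu)
            key = (k, k * omega + mu)
            terms[key] = terms.get(key, 0) + a
    return terms


def split_terms(terms):
    """Return (K1, w1): the t-independent part and minus the t-antiderivative."""
    K1 = {k: a for (k, Om), a in terms.items() if Om == 0}
    w1 = {(k, Om): -a / (1j * float(Om))
          for (k, Om), a in terms.items() if Om != 0}
    return K1, w1


# Usage: x**3 * cos(t) (N = 4, c_{+1} = c_{-1} = 1/2) at omega = 1/2.
K1, w1 = split_terms(h1_terms(4, {1: 0.5, -1: 0.5}, Fraction(1, 2), 0.3))
print(K1)  # coefficients of exp(-2i*theta) and exp(+2i*theta)
```

At $\omega = 1/2$ the two surviving coefficients combine to $-J^2\cos 2\theta$. At a frequency such as $\omega = 1/3$ no term of this perturbation is $t$-independent, so `K1` comes back empty.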


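The poles of $w_1$ are $\omega = 0$, which we shall ignore, and

$$\omega = \frac{\mu}{N_\alpha - 2\nu}, \qquad \mu \in M_\alpha, \quad 0 \le \nu \le N_\alpha.$$

Let $\Omega_1$ denote the set of $O(\epsilon)$ resonances. Let $\Omega_2$ denote the set of poles of
$w_2$ which are not in $\Omega_1$. Then $\Omega_2$ is the set of $O(\epsilon^2)$ resonances. The equation
defining $w_2$ is

$$\frac{\partial w_2}{\partial t} = 2K_2 - \{w_1, K_1\} - \{w_1, H_1\}. \qquad (13)$$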


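It is sufficient to determine the possible exponents introduced in the right hand
side of eq.(13) since the poles of $w_2$ correspond to roots of the exponents. Since
$K_1$ and $K_2$ are autonomous by construction, the new resonances can come only
from the term $\{w_1, H_1\}$. It is clear from the definition of $a_{\mu\nu}^{(\alpha)}$ that

$$\frac{\partial a_{\mu\nu}^{(\alpha)}}{\partial J} = \frac{N_\alpha}{2J}\, a_{\mu\nu}^{(\alpha)}.$$

Therefore,

$$\{w_1, H_1\} = \sum_{\alpha,\beta} \sum_{\substack{\mu \in M_\alpha \\ \gamma \in M_\beta}} \sum_{\nu=0}^{N_\alpha} \sum_{\delta=0}^{N_\beta} \frac{(\delta N_\alpha - \nu N_\beta)\, a_{\mu\nu}^{(\alpha)} a_{\gamma\delta}^{(\beta)}}{J\,((2\nu - N_\alpha)\omega + \mu)}\, e^{i\theta(2(\nu+\delta) - N_\alpha - N_\beta)}\, e^{it((2(\nu+\delta) - N_\alpha - N_\beta)\omega + \mu + \gamma)}. \qquad (14)$$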

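If a frequency is a resonance, then it is a root of the $t$-dependent exponential.
It is easily seen that $\Omega_2$ consists of all $\omega$ which satisfy

$$\omega = \frac{\mu + \gamma}{N_\alpha + N_\beta - 2(\nu + \delta)}, \qquad \mu \in M_\alpha, \quad \gamma \in M_\beta, \quad 0 \le \nu \le N_\alpha, \quad 0 \le \delta \le N_\beta, \quad \delta N_\alpha \neq \nu N_\beta,$$

but which are not also included in $\Omega_1$.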
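The two index formulas can be enumerated mechanically. The Python sketch below is an illustration rather than part of the MACSYMA appendix. It assumes each perturbation term is given as a pair $(N_\alpha, M_\alpha)$, keeps only positive frequencies, and skips vanishing denominators. The function names `omega1` and `omega2` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def omega1(terms):
    """O(eps) resonances: mu / (N - 2*nu), mu in M, 0 <= nu <= N."""
    found = set()
    for N, M in terms:
        for mu, nu in product(M, range(N + 1)):
            den = N - 2 * nu
            if den != 0 and Fraction(mu, den) > 0:
                found.add(Fraction(mu, den))
    return found


def omega2(terms):
    """O(eps^2) resonances: (mu + gamma) / (Na + Nb - 2*(nu + delta)),
    with delta*Na != nu*Nb, excluding frequencies already in omega1."""
    found = set()
    for (Na, Ma), (Nb, Mb) in product(terms, repeat=2):
        for mu, gamma in product(Ma, Mb):
            for nu, delta in product(range(Na + 1), range(Nb + 1)):
                den = Na + Nb - 2 * (nu + delta)
                if delta * Na == nu * Nb or den == 0:
                    continue
                w = Fraction(mu + gamma, den)
                if w > 0:
                    found.add(w)
    return found - omega1(terms)


# Usage: the Mathieu equation, perturbation x*cos(t): N = 2, M = {-1, 1}.
mathieu = [(2, {-1, 1})]
print(omega1(mathieu), omega2(mathieu))
```

Exact rational arithmetic (`Fraction`) is used so that set membership and the subtraction of $\Omega_1$ are not affected by rounding.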

We  conclude  this  section  with  some  examples  which  demonstrate  how  to 
compute  the  resonant  frequencies. 


4.1  Examples 

Example 1

$$\ddot{x} + \omega^2 x + \epsilon x \cos(t) = 0$$

For this example,

$$N_1 = 2, \qquad M_1 = \{-1, 1\}.$$

$\Omega_1$ is generated by the numerators $\pm 1$ and denominators $2 - 2\nu$ with
$\nu = 0$ or $\nu = 2$. The only positive resonance is $\omega = 1/2$. For $\Omega_2$, the
possible numerators are $1 + 1$ and $-1 - 1$. The denominators are given by
$2 + 2 - 2(\nu + \delta)$ where $\nu = 0, 1, 2$ and $\delta = 0, 1, 2$ with $\nu \neq \delta$. Therefore $\nu + \delta$
can take on the values 1, 2, and 3, and consequently the allowed denominators
are $\pm 2$. The only frequency generated is $\omega = 1$. Therefore

$$\Omega_1 = \{\tfrac{1}{2}\}, \qquad \Omega_2 = \{1\}.$$

This agrees with the classical result for the Mathieu equation, which is that
the $O(\epsilon^n)$ resonance is $n/2$.

Example 2

$$\ddot{x} + \omega^2 x + \epsilon x \cos(t) + \epsilon x^3 = 0$$

Then

$$N_1 = 2, \qquad M_1 = \{-1, 1\},$$
$$N_2 = 4, \qquad M_2 = \{0\}.$$

$\Omega_1$ is determined exactly as in the previous example since the set $M_2$ cannot
contribute a nonzero frequency. (The $O(\epsilon)$ resonances can always be found by
considering each term of the perturbation separately.) For $\Omega_2$, the resonance
$\omega = 1$ is generated as in the previous example. The "mixing" of the sets
$M_1$ and $M_2$ introduces the possible numerators $\pm 1 + 0$, with corresponding
denominators $2 + 4 - 2(\nu + \delta)$ where $\nu = 0, 1, 2$ and $\delta = 0, 1, 2, 3, 4$. The
forbidden pairs are $(\nu,\delta) = (0,0)$, $(\nu,\delta) = (1,2)$, and $(\nu,\delta) = (2,4)$. The
allowed values of $\nu + \delta$ are 1, 2, 3, 4, and 5, giving allowed denominators $\pm 2$ and
$\pm 4$, so the new resonance is $\omega = 1/4$:

$$\Omega_1 = \{\tfrac{1}{2}\}, \qquad \Omega_2 = \{\tfrac{1}{4}, 1\}.$$

Example  3 

$$\ddot{x} + \omega^2 x + \epsilon x^n \cos(t) = 0, \qquad n \text{ odd}$$

The $O(\epsilon)$ resonant frequencies are


,1  5  „  22  r  „  22 
—  ^ 2  ’  3 ’ "5" ’ 3  ’ 

The $O(\epsilon^2)$ resonances are

97  34  17  93  34 

fi2  =  {34,23,21, 17, ^-,12,  ^,H, 10, if, ^,7^, 

z  o  zoo 

27131711342391721111710 

4’  2’  3’  2’  7’  5  ’  2  ’  4’  5  ’  **’  3  ’  5  ’  3  ’ 
13171151271210745  511 

4  ’  6  ’T’^T’?’2’  7  ’T’ 5’ PI’1’ 6’ 3’ 5'' 

5 The Stability of the Trivial Solution Near Resonance

Having identified the resonant frequencies, we now study the behavior of solutions
close to resonance. We first study the major qualitative difference
between linear and nonlinear parametric excitation.

We consider equations of the form

$$\ddot{x} + \omega^2 x + \epsilon f(t,x) = 0 \qquad (15)$$

where $f(t,x)$ is periodic in $t$ and strictly nonlinear in $x$. The case when $f(t,x)$
contains terms which are linear in $x$ with periodic coefficients has been studied
previously [5].

Let $\omega_0$ be a resonance, and take $\omega$ in eq.(15) to be

$$\omega^2 = \omega_0^2 + \epsilon\omega_1 + \epsilon^2\omega_2 + \cdots$$

Then eq.(15) becomes

$$\ddot{x} + \omega_0^2 x + \epsilon f(t,x) + (\epsilon\omega_1 + \epsilon^2\omega_2 + \cdots)x = 0.$$

Detuning from resonance introduces a linear $t$-independent perturbation. Since
detuning at $O(\epsilon^n)$ introduces a term of the form $\epsilon^n J$ to $H$, it also contributes
a term to $K_n$ which is independent of $\theta$ and linear in $J$. Since a nonlinear term
of order $O(x^m)$ in $f(t,x)$ contributes terms of order $O(J^{(m+1)/2})$ to $K_1$ and
terms of higher order to subsequent $K_i$, the stabilizing effect of the detuning
will dominate in a sufficiently small neighborhood of the origin. (This analysis
requires that $\epsilon$ be held fixed, while $J$ may be taken as small as necessary. For
sufficiently small $J$ the linear term dominates.) This implies that a strictly
nonlinear periodic perturbation cannot cause the trivial solution to be unstable
away from resonance.


6  The  Effect  of  Detuning  From  Resonance 


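We demonstrate the effect of detuning from resonance on the equation

$$\ddot{x} + \omega^2 x + \epsilon x^3 \cos t = 0. \qquad (16)$$

The resonances, as shown in a previous section, are

$$\Omega_1 = \{\tfrac{1}{4}, \tfrac{1}{2}\}, \qquad \Omega_2 = \{\tfrac{1}{3}, 1\}.$$

The MACSYMA implementation of the Lie transform algorithm, which is listed
in the appendix, shows that for $\omega^2 = \frac{1}{4} + \epsilon\omega_1 + \epsilon^2\omega_2$ the $O(\epsilon^2)$ reduced
hamiltonian is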

K  =  — e2J3cos40  +  4cji€2J2  cos  20  —  eJ2  cos  20 

Zt 


—  ^e2J3  +  uie2J  -  u\e2J  +  u\eJ. 

u 


The fixed points satisfy

$$\frac{\partial K}{\partial J} = 0, \qquad \frac{\partial K}{\partial \theta} = 0.$$

Solving for fixed points gives the $O(1)$ pairs of fixed points


_  uq  7uj2  -b  8w2  7 us3  -b  8u?iu?2  2 

T +  16  f  64  €  +  ‘ 


$$\theta = \frac{\pi}{2},\ \frac{3\pi}{2}$$

for $\omega_1 < 0$ and


U)\  7  LJ?  -}"  8u’2  /  CJ?  “b  8u^iU?2  9 

J  - - b  - 1 - £ - 1 - £  + 

2  16  64 


$$\theta = 0,\ \pi$$

for $\omega_1 > 0$. (Solutions which are $O(1/\epsilon)$ also exist, but we ignore them since
we are interested in the behavior of the trivial solution in a neighborhood of
the origin. These fixed points indicate the presence of elliptic periodic orbits
contained in the homoclinic loops of the $O(1)$ fixed points.)

We now classify the non-trivial fixed points by studying the hamiltonian in
a neighborhood of the fixed points. Below resonance, for the fixed points at
$\theta = \pi/2$ and $\theta = 3\pi/2$, the hamiltonian is

K  =  Yg-  ^(24u>2f  +  17u>i  e  +  24uq )  cos  29  -  8u2(  -  19w2£  -  8uq)  J. 

Above resonance, for the fixed points $\theta = 0$ and $\theta = \pi$, the hamiltonian is
K  =  —  Yg  ^(24it>2C  4“  17wj£  +  24uq )  cos  29  8u>2f  +  19w2€  +  8uq^  J. 

Both translated hamiltonians represent saddle points. As $\omega \to \frac{1}{2}$ the saddle
points move in toward the origin along the lines $\cos 2\theta = -1$. At $\omega = \frac{1}{2}$,
the origin is saddle-like. As $\omega$ increases from 1/2 the saddle points move out
from the origin on the lines $\cos 2\theta = 1$. Figure 1 shows Poincare maps below,
at, and above resonance. The Poincare maps were generated by integrating
the second-order equation eq.(16). Figure 2 shows the level curves of the reduced
hamiltonian (17), plotted on the same scale as the numerically generated
Poincare maps.
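A numerically generated Poincare map of this kind can be sketched as follows. The integrator used for the figures is not specified in the text, so the classical fourth-order Runge-Kutta scheme, the step count, and the function names `rk4_step` and `poincare_map` are all assumptions of this illustration. The equation integrated is $\ddot{x} + \omega^2 x + \epsilon x^3 \cos t = 0$, sampled at $t = 0 \bmod 2\pi$.

```python
import math


def accel(t, x, omega, eps):
    """Right-hand side for x'': -omega^2 * x - eps * x^3 * cos(t)."""
    return -omega ** 2 * x - eps * x ** 3 * math.cos(t)


def rk4_step(t, x, v, h, omega, eps):
    """One classical Runge-Kutta step for the system x' = v, v' = accel."""
    k1x, k1v = v, accel(t, x, omega, eps)
    k2x = v + 0.5 * h * k1v
    k2v = accel(t + 0.5 * h, x + 0.5 * h * k1x, omega, eps)
    k3x = v + 0.5 * h * k2v
    k3v = accel(t + 0.5 * h, x + 0.5 * h * k2x, omega, eps)
    k4x = v + h * k3v
    k4v = accel(t + h, x + h * k3x, omega, eps)
    x_new = x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x_new, v_new


def poincare_map(x0, v0, omega, eps, n_iter, steps_per_period=400):
    """Return the points (x, x') recorded at t = 0, 2*pi, 4*pi, ..."""
    h = 2 * math.pi / steps_per_period
    points = [(x0, v0)]
    x, v = x0, v0
    for period in range(n_iter):
        for i in range(steps_per_period):
            t = (period * steps_per_period + i) * h
            x, v = rk4_step(t, x, v, h, omega, eps)
        points.append((x, v))
    return points


# Usage: a short orbit slightly above the omega = 1/2 resonance.
orbit = poincare_map(0.1, 0.0, 0.500798, 0.1, 5)
print(len(orbit))
```

With $\epsilon = 0$ and $\omega = 1/2$ every orbit of the map has period two, which gives a quick sanity check of the integrator before the perturbation is switched on.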


7  A  Bifurcation  Between  Resonances 

Finally, we consider a bifurcation between two $O(\epsilon)$ resonances. We will use
$O(\epsilon)$ Lie transforms to study how the Poincare map changes as the amplitudes
of two perturbations change. The equation to study is

$$\ddot{x} + (1 + \epsilon\omega_1)x + \epsilon\left(s x^2 \cos(t) + (1 - s) x^3 \cos(2t)\right) = 0$$

for $0 \le s \le 1$. When $s = 0$, the quadratic term is absent and the cubic term
is resonant. When $s = 1$, the quadratic term is resonant and the cubic term is
absent. When $0 < s < 1$ both terms are resonant. The interaction of the two
resonances is of interest.

Without loss of generality, set $\omega_1 = 1$. The resulting reduced hamiltonian
is

$$K_1 = \frac{1}{2}J + \frac{\sqrt{2}\,s}{4} J^{3/2} \sin\theta - \frac{1-s}{4} J^2 \cos 2\theta. \qquad (18)$$
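Because $H_0 \equiv 0$ after the change of variables, $K_1$ is the $t$-independent part of $H_1$, that is, its average over one period. The Python sketch below checks this numerically for the equation of this section. It assumes the hamiltonian $h = p^2/2 + q^2/2 + \epsilon\left(\omega_1 q^2/2 + s\,q^3\cos(t)/3 + (1-s)\,q^4\cos(2t)/4\right)$ with $q = \sqrt{2J}\sin(\theta + t)$, and compares the average with the closed form $\omega_1 J/2 + (\sqrt{2}\,s/4)J^{3/2}\sin\theta - ((1-s)/4)J^2\cos 2\theta$. The function names are hypothetical.

```python
import math


def H1(J, theta, t, s, w1=1.0):
    """O(eps) hamiltonian in action-angle variables at unperturbed frequency 1."""
    q = math.sqrt(2 * J) * math.sin(theta + t)
    return (w1 * q ** 2 / 2
            + s * q ** 3 / 3 * math.cos(t)
            + (1 - s) * q ** 4 / 4 * math.cos(2 * t))


def K1_average(J, theta, s, w1=1.0, n=64):
    """Average of H1 over t in [0, 2*pi) on a uniform grid.

    H1 is a trigonometric polynomial of degree 6 in t, so a uniform grid
    with n > 6 points reproduces the mean up to rounding error.
    """
    return sum(H1(J, theta, 2 * math.pi * k / n, s, w1) for k in range(n)) / n


def K1_closed(J, theta, s, w1=1.0):
    """Closed form assumed for the reduced O(eps) hamiltonian."""
    return (w1 * J / 2
            + math.sqrt(2) * s / 4 * J ** 1.5 * math.sin(theta)
            - (1 - s) / 4 * J ** 2 * math.cos(2 * theta))


# Usage: the two values agree to rounding error at an arbitrary point.
print(K1_average(0.7, 1.1, 0.4), K1_closed(0.7, 1.1, 0.4))
```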

Figure 1. Numerically integrated Poincare maps ($\Sigma$: $t = 0 \bmod 2\pi$) of equation (16). Panels: $\omega = 0.500798$, $\omega = 0.5$, $\omega = 0.499199$.

Figure 2. Level curves of the reduced $O(\epsilon^2)$ hamiltonian (17) constructed by Lie transforms.

Fixed points satisfy

$$\frac{\partial K_1}{\partial J} = 0, \qquad \frac{\partial K_1}{\partial \theta} = 0$$

which give the fixed points

$$\theta = \frac{3\pi}{2}, \qquad \sqrt{J} = \frac{\sqrt{2}}{8(1-s)}\left(3s \pm \sqrt{9s^2 + 32s - 32}\right) \qquad (19)$$

and

$$\sin\theta = \frac{s\sqrt{2}}{4(s-1)\sqrt{J}}, \qquad J = \frac{8 - 8s - s^2}{8(1-s)^2}. \qquad (20)$$


The requirement that the radicand of eq.(19) be non-negative restricts $s$ to the
interval $0.8138 \le s < 1$. The requirement that $|\sin\theta| \le 1$ in eq.(20) restricts $s$
to $0 \le s \le 0.828427$.
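The existence intervals can be checked numerically. The Python sketch below evaluates both families of fixed points for a given $s$ and reports the gradient of the reduced hamiltonian there. It assumes the reduced hamiltonian $K_1 = J/2 + (\sqrt{2}\,s/4)J^{3/2}\sin\theta - ((1-s)/4)J^2\cos 2\theta$ and takes $\theta = 3\pi/2$ for the family of eq.(19). The function names are hypothetical.

```python
import math


def grad_K1(J, theta, s):
    """Partial derivatives (dK1/dJ, dK1/dtheta) of the assumed K1."""
    r = math.sqrt(J)
    dJ = (0.5 + 3 * math.sqrt(2) * s / 8 * r * math.sin(theta)
          - (1 - s) / 2 * J * math.cos(2 * theta))
    dth = (math.sqrt(2) * s / 4 * J * r * math.cos(theta)
           + (1 - s) / 2 * J * J * math.sin(2 * theta))
    return dJ, dth


def axis_fixed_points(s):
    """Fixed points at theta = 3*pi/2 (the family of eq. 19), as (J, theta)."""
    radicand = 9 * s * s + 32 * s - 32
    if radicand < 0 or s >= 1:
        return []
    roots = [math.sqrt(2) * (3 * s + sign * math.sqrt(radicand)) / (8 * (1 - s))
             for sign in (1, -1)]
    return [(r * r, 1.5 * math.pi) for r in roots if r > 0]


def off_axis_fixed_points(s):
    """Fixed points of the family of eq. 20, as (J, theta)."""
    if s >= 1:
        return []
    J = (8 - 8 * s - s * s) / (8 * (1 - s) ** 2)
    if J <= 0:
        return []
    sin_theta = math.sqrt(2) * s / (4 * (s - 1) * math.sqrt(J))
    if abs(sin_theta) > 1:
        return []
    theta = math.asin(sin_theta)
    return [(J, theta), (J, math.pi - theta)]


# Usage: s = 0.82 lies in the window where both families exist.
for J, theta in axis_fixed_points(0.82) + off_axis_fixed_points(0.82):
    print(round(math.sqrt(J), 5), round(theta, 5), grad_K1(J, theta, 0.82))
```

The lower limit of the first interval is the positive root of $9s^2 + 32s - 32 = 0$, and the upper limit of the second is $2\sqrt{2} - 2$.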

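We now classify the stability of the fixed points. The stability of the critical
point is characterized by the sign of the determinant of the Hessian $\mathcal{H}h$
evaluated at the fixed point. A rather lengthy computation (cf. [1]) shows that

$$\det(\mathcal{H}h(x_0)) = \left. h_{qq}h_{pp} - (h_{qp})^2 \right|_0 = \left. h_{JJ}h_{\theta\theta} - (h_{J\theta})^2 \right|_0.$$

The critical point is a saddle if

$$\left. h_{JJ}h_{\theta\theta} - (h_{J\theta})^2 \right|_0 < 0$$

and is a center if

$$\left. h_{JJ}h_{\theta\theta} - (h_{J\theta})^2 \right|_0 > 0.$$

For the fixed points satisfying $\theta = 3\pi/2$,

$$\left. h_{JJ}h_{\theta\theta} - (h_{J\theta})^2 \right|_0 < 0$$

gives the stability criterion

$$16(1-s)^2 J - 10\sqrt{2}\,s(1-s)\sqrt{J} + 3s^2 > 0$$

where, from eq.(19),

$$\sqrt{J} = \frac{\sqrt{2}}{8(1-s)}\left(3s \pm \sqrt{9s^2 + 32s - 32}\right).$$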
Substituting this into the inequality shows, after some algebra, that the limits
of stable $s$ are roots of the polynomial equation


The roots in the interval for which the fixed points exist are found to be
$s = 0.818337$ (for the $+$ root) and $s = 0.88871$ (for the $-$ root). The
stabilities of the fixed points for various $s$ are listed in a table below.

For the other fixed points, the stability criterion is

$$-\frac{(1-s)^2 J^2}{2} + \frac{(1-s)^2 J^2}{2}\sin^2\theta > 0$$

where

$$J = \frac{8 - 8s - s^2}{8(1-s)^2}, \qquad \sqrt{J}\,\sin\theta = \frac{s\sqrt{2}}{4(s-1)}.$$

Inserting these relations gives the stability criterion

$$\frac{(s^2 + 4s - 4)(s^2 + 8s - 8)}{64(1-s)^2} < 0.$$

Since these fixed points exist for $0 \le s \le 0.828427$ and the inequality
is not satisfied on this interval, these fixed points are always saddles.
The behavior for various $s$ is summarized below:

• At $s = 0$, saddles exist at $\sqrt{J} = 1$, $\theta = 0$ and $\sqrt{J} = 1$, $\theta = \pi$.

• As $s$ increases, the saddles move into the left side of the plane and toward
the horizontal axis.

• At $s = 0.8138$, a center appears at $\sqrt{J} = 2.31718$, $\theta = 3\pi/2$.

• As $s$ increases, the centers separate, remaining on the horizontal axis.

• At $s = 0.818337$ the center farthest from the origin on the horizontal axis
becomes a saddle.

• At $s = 2\sqrt{2} - 2$ the two saddles and the inner center coalesce and form
a center.

• At $s = 0.88871$ the inner center becomes a saddle.

• As $s \to 1$ the inner saddle moves to $\sqrt{J} = \sqrt{8}/3$ and the outer one moves
off to infinity.
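The saddle/center test can be applied to any single fixed point with a short Python sketch. It assumes the reduced hamiltonian $K_1 = J/2 + (\sqrt{2}\,s/4)J^{3/2}\sin\theta - ((1-s)/4)J^2\cos 2\theta$ and evaluates the determinant $K_{JJ}K_{\theta\theta} - K_{J\theta}^2$ from its analytic second derivatives. The function names `hessian_det` and `classify` are hypothetical, and the sketch is meant only for spot-checking individual values of $s$.

```python
import math


def hessian_det(J, theta, s):
    """K_JJ * K_thth - K_Jth**2 for the assumed reduced hamiltonian K1."""
    r = math.sqrt(J)
    a = math.sqrt(2) * s / 4   # coefficient of J^(3/2) * sin(theta)
    b = 1 - s                  # (b/4) is the coefficient of J^2 * cos(2*theta)
    K_JJ = 0.75 * a / r * math.sin(theta) - b / 2 * math.cos(2 * theta)
    K_tt = -a * J * r * math.sin(theta) + b * J * J * math.cos(2 * theta)
    K_Jt = 1.5 * a * r * math.cos(theta) + b * J * math.sin(2 * theta)
    return K_JJ * K_tt - K_Jt ** 2


def classify(J, theta, s):
    """Return 'saddle' for a negative determinant and 'center' for a positive one."""
    return "center" if hessian_det(J, theta, s) > 0 else "saddle"


# Usage: a fixed point off the theta = 3*pi/2 line at s = 0.5,
# using J = (8 - 8s - s^2) / (8 (1 - s)^2) and sqrt(J) sin(theta) = s sqrt(2) / (4 (s - 1)).
s = 0.5
J = (8 - 8 * s - s * s) / (8 * (1 - s) ** 2)
theta = math.asin(math.sqrt(2) * s / (4 * (s - 1) * math.sqrt(J)))
print(classify(J, theta, s))
```

At these off-axis points the determinant of the assumed hamiltonian reduces to $-(1-s)^2 J^2\cos^2\theta/2$, which is negative wherever the points exist.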

Figure 3 shows the transitions for various values of $s$.

Figure 3. Level curves of the reduced hamiltonian (18) for indicated values of $s$.
8 Appendix

The following MACSYMA program computes the Lie transform of a weak
perturbation of the simple harmonic oscillator.

/* Program to compute the lie transform near a resonance. */
/* If detuning is requested, the program will supply it */
/* in the form dw[i]*e^i. */
lie():=block
(
kill(y,dw,n,j,dotran),
assume(j>0),
maperror:false,
print(timedate()),
trunc:read("Truncation order:"),
om:read("Frequency"),
f:read("Perturbation (use x, e, and t):"),
detoon:read("Detune from resonance [y/n] ?"),
if detoon = y then f:f+sum(e^i * dw[i],i,1,trunc)*x,
dotran:read("Compute the co-ordinate transformation [y/n] ?"),
print("Equation to work with:"),
print('diff(x,t,2) + om^2*x + f = 0),

/* Construct the Hamiltonian in complex */
/* slow-flow co-ordinates. */
hh:map(pseudo_int_x,expand(exponentialize(f))),

/* Do the canonical change of co-ordinates to */
/* slow action-angle variables. */
hh:ev(hh,x=%e^(%i*om*t)*q/(2*%i*om) - %e^(-%i*om*t)*p),
hh:ev(hh,q=sqrt(2*j*om)*%e^(%i*th),
         p=sqrt(2*j*om)/(2*%i*om)*%e^(-%i*th)),

/* Now taylor expand hh to order trunc */
/* and assign h[i] values. */
tmp:expand(taylor(hh,e,0,trunc)),
for i from 0 thru trunc do
(
h[i]:coeff(tmp,e,i)
),

/* Initialize the new hamiltonian. */
k[0]:h[0],

  /* This loop does the transforms. */
  for n from 1 thru trunc do
  (
    print("Loop # ",n,"of ",trunc),
    temp: h[n] + sum(poisson(w[n-m],k[m]), m, 1, n-1)/n,
    temp: expand(temp + sum(m*inverse_evolution(n-m, h[m]), m, 1, n-1)/n),

    /* We don't need w[trunc] unless we are going to */
    /* compute the net transformation. */
    if (dotran = y or n < trunc) then w[n]: getw(n,temp),

    /* Cheat here. w[n] was chosen to make k[n] the */
    /* t-independent part of temp. */
    k[n]: map(nuke_t, temp)
  ),

  /* The result is in new action-angle variables. */
  /* Tell me what we got. */
  print(""),
  kk:sum(k[i]*e^i,i,0,trunc),
  kk:expand(rat(kk)),

  /* Tell me all about the reduced hamiltonian. */
  print("The reduced hamiltonian in transformed action-angle variables:"),
  realkk: expand(rat(realpart(kk))),
  print(realkk),

  /* If requested, compute the co-ordinate transformation. */
  if dotran = y then
  block
  (
    /* Use the inverse evolution operator to give the relation */
    /* between old and new. */
    physical_j:sum(e^i*inverse_evolution(i,j),i,0,trunc),
    physical_th:sum(e^i*inverse_evolution(i,th),i,0,trunc),
    physical_j:expand(realpart(physical_j)),
    physical_th:expand(realpart(physical_th)),

    /* Now tell me how big the transformation is: */
    print(""),
    print("The co-ordinate transformation has been computed."),
    print("length(physical_j)=",length(physical_j)),
    print("length(physical_th)=",length(physical_th))
  )
  else
    print("You told me not to compute the co-ordinate transformation.")

  /* Finished. */
)$

/* Function to look like integration in x. */
/* This function is mapped. */
pseudo_int_x(f):=
(
  f*x/(hipow(f,x) + 1)
)$

/* Function to compute poisson brackets in (th,j) space. */
poisson(f,g):=
(
  diff(f,th) * diff(g,j) - diff(f,j) * diff(g,th)
)$


/* Function to nuke t-independent stuff. */
/* This function is mapped. */
nuke_no_t(f):=
(
  if freeof(t,f) then
    0
  else
    f
)$

/* Function to nuke t-dependent stuff. */
/* This function is mapped. */
nuke_t(f):=
(
  if freeof(t,f) then
    f
  else
    0
)$


/* Function to compute generating function to */
/* nuke t-dependent terms. This function is not mapped. */
getw(n,f):=
block
(
  [tmp],
  tmp:expand(-n*map(nuke_no_t,f)),

  /* Factor the exponents in case an unspecified */
  /* omega is given. Note: lambda returns a list. */
  tmp:map(lambda([u],scanmap(factor,[u])),tmp),
  tmp:part(tmp,1),
  map(innegrate,tmp)
)$


/* Function to look like integration of complex */
/* exponential, hence the name. */
/* This function is mapped. */
innegrate(f):=
block
(
  [nn,mm,tmp,z],
  matchdeclare([nn,mm],freeof(t)),

  /* Define the pattern-matching rules for sines */
  /* and cosines. Note that the rules do not commute, */
  /* and ins must be performed before inc. */
  defrule(ins, sin(nn+mm*t),-cos(nn+mm*z)/mm),
  defrule(inc, cos(nn+mm*t),sin(nn+mm*t)/mm),
  tmp:expand(demoivre(f)),
  tmp:expand(applyb1(tmp,ins)),
  tmp:applyb1(map(nuke_no_t,tmp),inc)+map(nuke_t,tmp),
  tmp:ev(tmp,z=t),
  tmp:expand(exponentialize(tmp))
)$


/* Recursive function to compute kth term of */
/* inverse of evolution operator acting on h. */
inverse_evolution(k,h):=
(
  if k = 0 then h
  else
    sum(poisson(w[k-m],inverse_evolution(m,h)), m, 0, k-1)/k
)$

/* Recursive function to compute kth term of evolution */
/* operator acting on h. Note that this function is not */
/* used by the program. */
evolution(k,h):=
(
  if k = 0 then h
  else
    -sum(evolution(m,poisson(w[k-m],h)),m,0,k-1)/k
)$
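
As a cross-check on the sign convention of poisson(f,g) in the listing, the bracket diff(f,th)*diff(g,j) - diff(f,j)*diff(g,th) can be evaluated numerically. The Python sketch below is not part of the original program; the helper names (partial, poisson_numeric), the step size, and the sample point are illustrative assumptions. It uses central differences and checks the canonical pair {th, j} = 1, then applies the bracket to the angle-dependent term j^2 cos(2 th) that appears in the first example run.

```python
import math

def partial(func, th, j, wrt, step=1e-5):
    """Central-difference partial derivative of func(th, j) with respect to 'th' or 'j'."""
    if wrt == "th":
        return (func(th + step, j) - func(th - step, j)) / (2 * step)
    return (func(th, j + step) - func(th, j - step)) / (2 * step)

def poisson_numeric(f, g, th, j):
    """Bracket in (th, j) space with the same sign convention as poisson(f,g)."""
    return (partial(f, th, j, "th") * partial(g, th, j, "j")
            - partial(f, th, j, "j") * partial(g, th, j, "th"))

def angle(th, j):
    return th

def action(th, j):
    return j

def resonant_term(th, j):
    # j^2 cos(2 th): angle-dependent term of the first example run, without the factor e.
    return j**2 * math.cos(2 * th)

print(poisson_numeric(angle, action, 0.3, 1.2))          # close to 1
print(poisson_numeric(action, resonant_term, 0.3, 1.2))  # close to 2 j^2 sin(2 th)
```

With this convention the bracket of j with a function g equals minus the th-derivative of g, which is why the second value approximates 2 j^2 sin(2 th).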


The following examples were run with the MACSYMA option "SHOWTIME:ALL"
on a VAX 8500.


(c4) lie()$

Wed Jun 10 15:59:17 1987

Truncation order:
1;

Frequency
1/2;

Perturbation (use x, e, and t):
e*x^3*cos(t);

Detune from resonance [y/n] ?
y;

Compute the co-ordinate transformation [y/n] ?
y;

Equation to work with:

 2
d x             3               x
--- + e cos(t) x  + dw  e x  +  - = 0
  2                   1         4
dt

Loop # 1 of 1

The reduced hamiltonian in transformed action-angle variables:

             2
dw  e j - e j  cos(2 th)
  1

The co-ordinate transformation has been computed.
length(physical_j)= 5
length(physical_th)= 6

Totaltime= 52500 msec. GCtime= 20116 msec.

Next,  a  run  with  a  symbolic  frequency  to  show  how  the  resonant  frequencies 
may  be  computed: 

(c1) lie()$

Wed Jun 10 16:01:32 1987

Truncation order:
1;

Frequency
omega;

Perturbation (use x, e, and t):
e*x^3*cos(t);

Detune from resonance [y/n] ?
n;

Compute the co-ordinate transformation [y/n] ?
y;

Equation to work with:

 2
d x             3        2
--- + e cos(t) x  + omega  x = 0
  2
dt

Loop # 1 of 1

The reduced hamiltonian in transformed action-angle variables:

0

The co-ordinate transformation has been computed.
length(physical_j)= 5
length(physical_th)= 6

Totaltime= 72200 msec. GCtime= 26500 msec.

(c8) factor(denom(rat(ev(w[1],t=0))));

Totaltime= 3250 msec. GCtime= 1266 msec.

             2
(d8) 32 omega  (2 omega - 1) (2 omega + 1) (4 omega - 1) (4 omega + 1)

The poles of the generating function at O(e) are at omega = ±1/2 and omega = ±1/4.
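
The resonant frequencies can be read off from the linear factors of the denominator in (d8). The short Python sketch below does this with exact rational arithmetic. It is an illustration only: the factor list is transcribed by hand from the (d8) display, the constant 32 and any power of omega (a pole at omega = 0) are left out, and the names D8_LINEAR_FACTORS and resonant_frequencies are hypothetical.

```python
from fractions import Fraction

# Linear factors a*omega + b transcribed from the (d8) output.
# The constant 32 and the power of omega are intentionally omitted.
D8_LINEAR_FACTORS = [(2, -1), (2, 1), (4, -1), (4, 1)]

def resonant_frequencies(factors):
    """Return the sorted roots -b/a of each linear factor a*omega + b."""
    return sorted(Fraction(-b, a) for a, b in factors)

print([str(w) for w in resonant_frequencies(D8_LINEAR_FACTORS)])
```

Each root marks a frequency at which the first-order generating function becomes singular, so a detuned run of lie() near that frequency is the appropriate tool there.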




References 

[1] J.L. Len, Nonlinear Parametric Excitation With Averaging and Lie Transform
Methods. Ph.D. Thesis, Cornell University, 1987.

[2] John R. Cary, Lie Transform Perturbation Theory for Hamiltonian Systems.
Physics Reports (Review Section of Physics Letters) 79, No. 2 (1981), 129-159.

[3] Eugene J. Saletan and Alan H. Cromer, Theoretical Mechanics. John Wiley
and Sons, 1971.

[4] Herbert Goldstein, Classical Mechanics. Addison-Wesley, Reading,
Massachusetts, 1981.

[5] Leslie Anne Month and Richard H. Rand, Bifurcation of 4:1 Subharmonics
in the Nonlinear Mathieu Equation. Mechanics Research Communications,
Vol. 9(4), 233-240, 1982.

[6] Richard H. Rand, Computer Algebra and Applications in Applied
Mathematics. Research Notes in Mathematics 94, Pitman, 1984.

[7] Richard H. Rand and Dieter Armbruster, Perturbation Methods, Bifurcation
Theory, and Computer Algebra. Applied Mathematical Sciences, Volume 65,
Springer-Verlag, New York, 1987.




Knowledge  Representation  and  Planning  Control  in  an  Expert  System 
for  the  Creative  Design  of  Mechanisms 

David  A.  Hoeltzel 
Assistant  Professor 

Wei-Hua Chieng
Graduate  Research  Assistant 

John  Zissimides 
Graduate  Research  Assistant 

Laboratory  for  Intelligent  Design 
Department  of  Mechanical  Engineering 
Columbia  University 
New  York,  New  York  10027 

Abstract 

An interactive system, referred to as MECXPERT {Mechanism Expert}, has been
designed with the express purpose of assisting nonexpert design engineers in
creating  mechanisms  for  fulfilling  specific  motion-conversion  and/or  power- 
transmission  requirements.  The  particular  knowledge  representation  chosen  for  this 
application  comprises  a  hybrid  formulation  of  a  rule-based  production  system  with  a 
frame-based  approach.  The  underlying  control  strategy  is  based  on  a  series  of 
special-purpose,  domain-specific  operators  whose  function  is  to  move  from  one 
problem  space  to  another  through  various  stages  or  "states"  that  comprise  the 
mechanism  design  process. 

The  primary  focus  of  this  paper  centers  on  the  representation  of  knowledge  and 
its  control  within  an  expert  system  for  creative  mechanism  design.  An  overview 
summarizing  the  reasons  for  developing  such  an  expert  system  is  provided,  and  the 
formulation  of  a  problem  is  discussed  through  an  example  taken  from  the  design  of  a 
variable-stroke  internal-combustion  engine. 

Introduction 

The need for better, more nearly optimal, and systematically designed
mechanical devices in today’s competitive world economy necessitates the
development of expert systems. The goal of an expert system is to capture an
expert’s knowledge and heuristic skills in the performance of a domain-specific task.
Such a system should assist less experienced engineers in producing better designs
in a timely manner. Toward that end, an expert system for the creative design of
mechanisms has been developed.

Note: Bold, italicized words appear in the glossary in alphabetical order.

This  paper  discusses  the  manner  in  which  knowledge  is  represented,  manipulated 
and  controlled  in  a  mechanism  design  expert  system,  an  important  first  step  in  the 
overall  development  of  the  system.  Expandability,  generality,  system  longevity  and 
efficiency  in  expended  effort  were  subjects  of  prime  concern  in  developing  the 
knowledge representation, with an eye toward long-term commitment to system
improvement.

Historically  there  have  been  three  approaches  to  the  conceptual  design  of 
mechanisms:  (1)  the  experience  of  a  designer  and/or  layout  draftsman,  (2)  the  use 
of  atlases  or  compendia  of  mechanisms,  still  the  most  widely  used  approach,  and  (3) 
the investigation of the kinematic structure of mechanisms. The second approach has
been developed, most notably, by Jones et al. [1] and Artobolevskii [2] and makes for
interesting and informative reading. The development of the third approach is more
elusive, but holds remarkable promise for this most difficult phase of mechanical
design because of its systematic and unbiased nature. The expert system currently
under development uses the latter methodology as a basis for problem formulation
and model development, coupled with a heuristic approach for determining
structure-function relationships for mechanisms, as a basis for what we refer to as
experience-based mechanism design.

Few studies have made significant progress in developing expert systems
for mechanism design. The work of Kota, Erdman and Riley [3,4] stands out as the
most  notable  for  applying  expert  system  techniques  to  the  design  of  dwell 
mechanisms.  The  authors  reported  on  progress  achieved  on  their  system  with  an  eye 
toward  future  development  of  a  more  general  system  whose  purpose  is  to  design 
mechanisms  capable  of  generating  straight  lines,  circular  arcs,  symmetric  curves 
and  parallel  motion,  in  addition  to  dwell. 

The  overall  process  of  creative  mechanism  design  can  be  separated  into  a  number 
of  steps,  some  possessing  considerable  levels  of  difficulty  and  requiring  significant 
long  term  mechanism  design  experience  and  intuition.  These  steps  are  depicted  in 
the form of a flow chart in Figure 1. The creative design of mechanisms, i.e., type
and dimensional synthesis, is a complex task requiring deep domain knowledge, as
compared with the knowledge required to generate routine designs that fulfill
relatively simple or previously determined motion-conversion requirements, or to
redesign existing workable designs through minor modifications.

While the generation of numerical solutions corresponding to both the kinematics
and dynamics of a known mechanism, as well as its animation, are relatively
straightforward processes, creative mechanism design is, in contrast, extremely
complex. It requires, we believe, more of a heuristic approach, particularly during
the more conceptual phases of the mechanism design process.

The major obstacle to overcome in creative mechanism design centers around the
determination of mechanism topologies (structures) for the fulfillment of specific
design requirements, i.e., establishing a finite, definable mapping between specified
design requirements (functionality) and mechanism structure(s) capable of fulfilling
the design requirements. Such a mapping will, in most cases, be one-to-many. Our
work in this area centers around (1) the use of statistical machine learning for the
cognitive recognition of characteristic motion patterns (functionality) associated
with specific classes of mechanisms and the correlation of functionality (function
generation, path generation, rigid body guidance) with structural features (links,
joints and the manner in which they are connected) embodied in the mechanisms, and
(2) the development of a general vocabulary and "language", hierarchically structured
and consistent with the terminology of functional requirements and structural
characteristics, through which a mechanism designer can establish a bi-directional
channel of communication with the system in order to convey functional
requirements and receive feedback in a manner that is natural to mechanism design.
This approach can be looked upon as a heuristic extension of the concept of the
separation of kinematic structure and function conceived by Freudenstein [5].

Developing  an  Expert  System  for  Mechanism  Design 

It  is  our  contention  that  an  expert  system  for  mechanism  design  should  act  as  an 
intelligent  assistant  and  mentor,  guiding  the  design  engineer  during  the  creative 
process  of  mechanism  synthesis.  Furthermore,  the  primary  purpose  of  the  system 
should  be  to  fulfill  user-specified  predetermined  motion-conversion  or  power- 
transmission  requirements  through  the  creation  of  an  intelligent  interactive 
environment  with  the  mechanism  design  engineer. 

With  this  idea  in  mind,  the  system  under  development  has  been  fashioned  around 
the  concept  of  the  separation  of  kinematic  structure  and  function.  Figure  2A  depicts 
the  essence  of  this  subtle  but  important  concept  by  means  of  an  example.  The 
functional  requirements  of  the  spatial  slider  crank  mechanism  (converts  rotary 
motion into out-of-plane reciprocating motion) are provided as input-output
(functional) specifications by the user, while the structural characteristics, i.e., those
characteristics which will fulfill the functional requirements, are manifested in the
actual physical embodiment of the mechanism, that is, the number and type of links
and joints and the manner in which they are interconnected.

The expert system has been logically and hierarchically segmented into
the following four subcomponents:

1. Specification of the desired kinematic structural characteristics and functional
requirements.

2. Determination of the kinematic structure of all potentially useful mechanisms
based on (1) the information provided in step 1 and (2) statistical machine
learning for the cognitive matching of known kinematic topologies with the
functional requirements the topologies are known to fulfill.

3. Screening of mechanisms according to their ability to fulfill both the
functional and structural constraints.

4. Selection of the most favorable mechanism(s), i.e., the one(s) most nearly
satisfying the constraints, for further development (analysis and animation).

As  demonstrated  by  Dobrjanskyj  and  Freudenstein  et  al.  [6]  and  Crossley  [7],  the 
kinematic  structure  of  a  mechanism  can  be  conveniently,  compactly  and  precisely 
represented,  mathematically,  using  linear  graph  theory.  Enumeration  of  the 
structure  of  mechanisms  coupled  with  subsequent  isomorphism  checking  for  the 
elimination  of  duplicate  mechanism  structures  using  link  connectivity  matrices 
provides  an  efficient  computational  scheme  for  representing  and  sorting  the 
kinematic  structure  of  candidate  mechanisms.  Figure  2B  depicts  the  graph 
representation  and  corresponding  link  connectivity  matrix  for  a  spatial  slider  crank 
mechanism.  As  previously  mentioned,  the  correlation  of  kinematic  structure(s)  with 
predefined  functional  requirements,  for  large  classes  of  mechanisms,  represents  the 
primary  bottleneck  to  creative  mechanism  design  and  is  the  area  in  which  much  work 
remains  to  be  done. 
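
The graph representation and duplicate elimination described above can be sketched in a few lines of Python. This is an illustration, not the MECXPERT implementation: it assumes the usual convention that links are graph vertices and joints are edges, ignores joint types, and uses a brute-force search over link relabelings, which is practical only for small link counts. The function names and the example mechanisms are hypothetical.

```python
from itertools import permutations

def connectivity_matrix(num_links, joints):
    """Symmetric 0/1 matrix: entry [a][b] is 1 when links a and b share a joint."""
    matrix = [[0] * num_links for _ in range(num_links)]
    for a, b in joints:
        matrix[a][b] = matrix[b][a] = 1
    return matrix

def is_isomorphic(m1, m2):
    """Brute-force check: does some relabeling of links map m1 onto m2?"""
    n = len(m1)
    if n != len(m2):
        return False
    # Cheap invariant first: the sorted joint counts per link must agree.
    if sorted(map(sum, m1)) != sorted(map(sum, m2)):
        return False
    return any(
        all(m1[i][j] == m2[p[i]][p[j]] for i in range(n) for j in range(n))
        for p in permutations(range(n))
    )

def remove_duplicates(matrices):
    """Keep one representative of each isomorphism class."""
    unique = []
    for m in matrices:
        if not any(is_isomorphic(m, u) for u in unique):
            unique.append(m)
    return unique

# A four-link closed loop labeled two different ways, plus an open chain.
four_bar = connectivity_matrix(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
relabeled = connectivity_matrix(4, [(0, 2), (2, 1), (1, 3), (3, 0)])
open_chain = connectivity_matrix(4, [(0, 1), (1, 2), (2, 3)])

print(len(remove_duplicates([four_bar, relabeled, open_chain])))
```

The factorial cost of the permutation search is one reason enumeration schemes rely on invariants, such as the joint count per link used here, to discard most candidates before a full comparison.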

Remaining within the bounds of a limited domain is the natural and most logical
course of action to be adopted in any new expert system development endeavor,
particularly in a domain as complex as mechanism design. As a result, the
time-tested incremental approach to software design has been utilized [8]. In this
approach a top-down building-block strategy is employed whereby modular pieces of
the system are configured, keeping the overall system configuration in mind to avoid
costly redevelopment, and incrementally tested to ensure correct results. The
system is presently limited to planar mechanisms having kinematic pairs with a
maximum of two degrees-of-freedom.

Finally,  in  designing  a  system  for  general  applicability  it  is  imperative  that  a 
test  problem  be  selected  that  reflects  the  complexity  of  the  domain  (mechanism 
design)  without  including  excessive  detail  or  problem  size  which  would  unnecessarily 
and  unavoidably  complicate  program  verification  and  performance  evaluation.  For 
this  reason  a  test  problem  fashioned  around  the  design  of  a  variable-stroke  internal- 
combustion  engine,  which  has  been  previously  solved  in  detail  by  Freudenstein  and 
Maki  [9],  has  been  selected.  The  designer’s  reasoning  processes  in  making  problem 
specific  design  decisions  have  been  explicitly  verbalized  in  their  paper. 

Software  Implementation  Issues 

In  its  present  state,  the  MECXPERT  system  has  been  implemented  using  the  OPS5 
production  system  programming  language  [10]  embedded  within  the  Knowledge  Craft 
expert  system  development  environment  [11].  Programs  developed  in  the  OPS5 
language  are  composed  of  data-sensitive  unordered  rules,  where  the  data  can  be  (1) 
instances  of  physical  objects,  (2)  facts  related  to  the  domain  of  application  and  (3) 
conceptual  objects  (such  as  goals)  related  to  the  problem-solving  strategy.  The  rules 
that  constitute  the  program  are  composed  of  two  parts.  The  first  is  the  condition 
part  and  consists  of  data  elements.  The  second  part  of  a  rule  is  the  action  part  and  is 
composed  of  instructions  that  change  the  current  data  configuration. 

Program execution occurs in "cycles", each consisting of three actions: match
rules, select among the matching rules, and execute the selected rule. A rule can be
executed only if all the data elements in its condition part match the current data
configuration. OPS5 provides two possible strategies, lexicographic ordering (LEX)
and means-ends analysis (MEA), for selecting the rule to be fired when more than one
rule is applicable. In this case the MEA conflict resolution strategy was selected
because it places additional emphasis on the recency of the working memory element
that matches the first condition element of a rule. In this way, when the first
condition element of a rule is a goal element, the system will not be distracted by a
very recent working memory element that is not a goal (i.e., rule selection is goal
driven). The data configuration changes after every cycle is completed, except the
final one. For this reason the system can be said to use a data-driven inference
strategy.
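
The match, select, execute cycle and the difference between the two conflict resolution strategies can be illustrated with a small Python sketch. It is a deliberately simplified model and not OPS5 itself: conditions are literal working memory elements, recency is an integer time tag, and the two ordering functions (lex_key, mea_key) capture only the recency comparison described above. All names and the example rules are hypothetical.

```python
def match(rule, memory):
    """Time tags of the newest element matching each condition, or None if any is unmatched."""
    tags = []
    for condition in rule["conditions"]:
        hits = [tag for tag, element in memory.items() if element == condition]
        if not hits:
            return None
        tags.append(max(hits))
    return tags

def lex_key(tags):
    """LEX-like ordering: compare all time tags, most recent first."""
    return sorted(tags, reverse=True)

def mea_key(tags):
    """MEA-like ordering: recency of the first condition's element dominates."""
    return (tags[0], sorted(tags[1:], reverse=True))

def run(rules, memory, key, max_cycles=20):
    """Match, select, execute until no unfired instantiation remains; returns firing order."""
    memory = dict(memory)  # work on a copy of working memory
    fired, trace = set(), []
    for _ in range(max_cycles):
        # Match: collect every instantiation that has not fired before.
        conflict_set = []
        for rule in rules:
            tags = match(rule, memory)
            if tags is not None and (rule["name"], tuple(tags)) not in fired:
                conflict_set.append((rule, tags))
        if not conflict_set:
            break
        # Select: apply the conflict resolution ordering.
        rule, tags = max(conflict_set, key=lambda item: key(item[1]))
        fired.add((rule["name"], tuple(tags)))
        trace.append(rule["name"])
        # Execute: the action adds elements under fresh time tags.
        for element in rule["adds"]:
            memory[max(memory) + 1] = element
    return trace

rules = [
    {"name": "work-on-A", "conditions": [("goal", "A"), ("data", "X")], "adds": [("result", "A")]},
    {"name": "work-on-B", "conditions": [("goal", "B"), ("data", "Y")], "adds": [("result", "B")]},
]
# Time tags: goal A is old, goal B is newer, data X is the newest element.
memory = {1: ("goal", "A"), 2: ("data", "Y"), 3: ("goal", "B"), 4: ("data", "X")}

print(run(rules, memory, mea_key))  # ['work-on-B', 'work-on-A']
print(run(rules, memory, lex_key))  # ['work-on-A', 'work-on-B']
```

Under the MEA-like ordering the rule tied to the most recent goal fires first even though the newest element in memory is a data item belonging to the older goal, which is the behavior the paragraph above attributes to MEA.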

In this system, a goal-driven inference strategy is inappropriate because, in the
process of mechanism synthesis, the final mechanism topologies suitable for
prescribed motion conversion or power transmission are not known a priori, but
are to be uncovered through the interactive design process embedded within the expert
system.

The structure of the software has been developed in accordance with the
requirements specified by the domain by developing data structures that ensure the
creation  of  a  planning  strategy  capable  of  simulating  mechanism  design  procedures 
and  emulating  human  thought  processes  which  occur  during  mechanism  design  as 
these  would  be  performed  within  these  procedures.  The  procedures  use  various  types 
of  knowledge  to  implement  appropriate  reasoning  schemes.  It  is  therefore  of 
extreme  importance  to  effectively  represent  knowledge  and  simulate  planning,  since 
these  two  functions  determine  the  path  that  the  design  undergoes  and  whether  or  not 
all  facets  of  the  design  process  are  properly  taken  into  account. 

Building  a  Model  for  Knowledge-Based  Mechanism  Design 

In  order  to  develop  a  formal  description,  i.e.  model,  of  the  mechanism  synthesis 
problem  it  is  necessary  to: 

1. Define a state space representation containing all the possible configurations of the
relevant parts of the problem, without necessarily enumerating, in detail, all
these states. In fact, in mechanism design such an enumeration is impractical due
to the NP-complete nature of the problem, i.e., exponential growth in time
complexity. For example, the graphs corresponding to a planar six bar
mechanism represent an upper limit of O(10  ) unique kinematic structures, while
those for a planar eight bar mechanism represent an upper limit of O(10  ) unique
kinematic structures. This latter number of possible mechanism structures is too
large to undergo detailed development or in-depth evaluation given the present
level of readily available engineering computing power. This presupposes that the
problem is decomposable. We have found that it is.

2. Specify  one  or  more  states  within  the  space  representing  possible  situations  from 
which  the  problem-solving  process  may  start.  These  are  the  initial  states. 

3. Specify  a  set  of  rules  describing  the  operations  which  permit  movement  through  the 
space  from  its  initial  state  to  its  goal  state. 

4. Specify acceptable solution or goal states to the problem. In mechanism design
information of this nature would be provided to the system via user input in the
form of (1) an input-output function to be generated, (2) a description of position
and orientation of a rigid body to be guided, (3) a path to be generated through a
finite number of points by a point on the coupler link of the mechanism or (4) a
power transmission or energy conversion requirement.

Each of the above listed parts of an overall system model, as it specifically
relates to mechanism design, is discussed in the following sections.
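
The four ingredients listed above (a state space, initial states, operators, and goal states) can be tied together in a minimal Python sketch. It is a generic breadth-first search and not the planning strategy of the system described here; the toy state, a pair (independent loops, prismatic joints), and the two operators are invented purely for illustration.

```python
from collections import deque

def search(initial_states, operators, is_goal):
    """Breadth-first search; returns the operator names leading to the first goal state found."""
    frontier = deque((state, []) for state in initial_states)
    visited = set(initial_states)
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for name, apply_operator in operators:
            successor = apply_operator(state)
            # An operator returns None when it is not applicable to the state.
            if successor is not None and successor not in visited:
                visited.add(successor)
                frontier.append((successor, path + [name]))
    return None

# Toy state: (independent_loops, prismatic_joints). Both fields are illustrative only.
operators = [
    ("add-loop", lambda s: (s[0] + 1, s[1]) if s[0] < 3 else None),
    ("eliminate-prismatic-joint", lambda s: (s[0], s[1] - 1) if s[1] > 0 else None),
]

plan = search([(1, 1)], operators, lambda s: s == (2, 0))
print(plan)  # ['add-loop', 'eliminate-prismatic-joint']
```

The states are generated on demand by the operators rather than enumerated in advance, which is the point made in item 1 of the list.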

Data  Structures

Appropriately designed data structures are the means by which planning and
knowledge representation can be effectively implemented in an expert system. The
problem domain, in this case mechanism design, is broken down hierarchically
into problem-spaces {PS}. These {PS}’s represent states that the
system can reside in and pass through in its effort to achieve its goal. Thus, the
system can be imagined to emulate the mechanism designer’s thought processes,
where the current {PS} represents the issue or concept under consideration. After
reaching a given state the system must choose the next state to which it will move.
To achieve this a data structure element, referred to as a sub-problem {SP}, has been
created to indicate to the current {PS} what the next available states, i.e., {PS}’s,
will be. Therefore, within each {PS} there are {SP}’s which represent potential
compatible {PS}’s to which the system can move. It should be emphasized that the
term sub-problem is a relative one in the sense that it describes the next possible
problem-space, {PS}, to which the system may move from the current
problem-space, {PS}, thereby establishing a parent-child relationship between the two.

Operators, {OP}, whose function is to decide what the next {PS} is to be, based on
the current status of the design, are present within all {PS}’s. A data structure,
referred to as "compatibility", defines the compatible {PS}’s and {SP}’s and is
created each time the system is initialized. In essence this data structure
establishes the fixed (common for different design goals) graphical tree structure
[12] that represents the domain of mechanism design (Figure 4), to the extent that it
is represented in this model. This structural representation is possible since the
design domain can be hierarchically subdivided.


The hierarchical nature of the mechanism design domain has been schematically
illustrated in Figure 3. Details have been intentionally omitted at this point
in the discussion in order to avoid confusion; following sections
elaborate the details of the system structure specific to mechanism design. When
the current {PS} is "1", the available {SP}’s will be "2", "3" and "4". If {SP} "3" is
selected then the current {PS} becomes "3" and the subsequent {SP}’s are "7" and "8".
Four different defining characteristics can be associated with each problem space. A
{PS} is said to be "complex" when, in order to be solved, it has to be broken down into
other {PS}’s (e.g. {PS}’s "1", "3", "4" and "9"). A {PS} is said to be "simple" when
only a few predetermined steps (actions) are necessary to solve it (e.g.
{PS}’s "2", "7", "8", "10", "15", and "16"). Note that in Figure 3, circles are used to
schematically represent steps rooted in simple {PS}’s. Therefore, a {PS} can be
solved either by successfully executing a predefined number of steps (actions), such
as eliminate prismatic (p-type) joints or increase the current number of independent
loops, L_ind, by one, or by solving all the required {SP}’s (defined within each {PS}
by an operator whose role is to evaluate the state of the current {PS}) that are rooted
in the current {PS}. When a {PS} is successfully solved, its status is said to
be "achieved". When a {PS} has not yet been achieved because its
rooted {SP}’s and/or steps are still being processed, its status is said to be
"pending"; otherwise its status is said to be "failed" (within each {PS} there is
knowledge that is used by an operator, called the evaluate-state operator, to
determine the status of the {PS}).
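
The status rules just described can be summarized in a short Python sketch. It is an interpretation rather than the system's code: it assumes that a complex {PS} requires every rooted {SP} to be achieved and fails as soon as one of them fails, and it represents the tree as a plain dictionary. Only the branch {PS} "3" with {SP}’s "7" and "8" is taken from the text; the function and variable names are hypothetical.

```python
def evaluate_state(ps, children, step_status):
    """Return 'achieved', 'pending', or 'failed' for a problem space.

    A simple {PS} (no entry in children) takes the status recorded for its own
    steps. A complex {PS} is achieved only when every rooted {SP} is achieved
    (an assumption of this sketch) and fails when any rooted {SP} has failed.
    """
    subproblems = children.get(ps, [])
    if not subproblems:
        return step_status.get(ps, "pending")
    statuses = [evaluate_state(sp, children, step_status) for sp in subproblems]
    if "failed" in statuses:
        return "failed"
    if all(status == "achieved" for status in statuses):
        return "achieved"
    return "pending"

# {PS} "3" is complex; its rooted {SP}'s "7" and "8" are simple.
children = {"3": ["7", "8"]}

print(evaluate_state("3", children, {"7": "achieved"}))                   # pending
print(evaluate_state("3", children, {"7": "achieved", "8": "achieved"}))  # achieved
print(evaluate_state("3", children, {"7": "achieved", "8": "failed"}))    # failed
```

Because the status of a complex {PS} is derived from its children, control can return to a parent and re-evaluate it without storing any extra bookkeeping in the parent itself.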

A {PS} is said to be "fixed" when the choice of {SP}’s is independent of the
current design assignment specified by the user, and thus knowledge can be given to
the system to reject all but a single {SP}. Finally, a {PS} is said to be
"probabilistic" when the selection of a {SP} depends on its probability of success.
The probabilistic {SP}’s are independent of one another and each carries a weighting
factor that depends on problem specifications which are defined by the user during the
"problem definition" (entry of input data) phase of mechanism design.

In Figures 3 and 4 probabilistic problem-spaces are denoted by rooted {SP}’s
which are interconnected with their respective parent {PS}’s by dotted lines, and fixed
problem spaces are denoted by rooted {SP}’s that are interconnected with their
respective parent {PS}’s by solid lines.
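
For a probabilistic {PS}, the choice among rooted {SP}’s can be sketched as follows. The averaging of weighting factors follows the paper's description of the assign-probability-of-success operator; choosing the {SP} with the largest average is an assumption of this sketch, as are the function names and the numerical weights.

```python
from statistics import mean

def probability_of_success(weights):
    """Average the weighting factors attached to one {SP}."""
    return mean(weights)

def select_subproblem(weighting_factors):
    """Pick the {SP} with the highest averaged probability of success (assumed policy)."""
    return max(weighting_factors,
               key=lambda sp: probability_of_success(weighting_factors[sp]))

# Hypothetical weighting factors for the two {SP}'s rooted in {PS} "3".
weights = {"7": [0.9, 0.5], "8": [0.6, 0.6]}

print(select_subproblem(weights))
```

In a fixed {PS} no such computation is needed, because the admissible {SP} is determined by built-in knowledge rather than by user-dependent weights.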

Referring to Figure 3, it can be seen that when the current {PS} is "1", and
if and only if {PS} "1" is fixed, the following knowledge can be built (hard
coded) into the system: If the status of every {SP} directly beneath {PS} "1" is
pending, then reject all {SP}’s except "2". If the status of "2" is achieved and the
status of each remaining {SP} on the same level is pending, then select "4" by
rejecting the other pending {SP}’s on this level, etc. This process of rejecting
{SP}’s is realized through the use of a "reject" operator that is active in every {PS}.
In the first case of the above example, the "reject" operator would reject {SP}’s "3"
and "4" and in the second case it would reject {SP} "3". Each time control returns to
a {PS} from a lower level {SP} (either because it has been achieved or has failed) the
status of each {SP} of the current {PS} that has neither failed nor been
achieved is reset from "rejected" to "pending" so that it will be available when
appropriate and necessary.
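
The example above can be traced with a short Python sketch of the "reject" operator in a fixed {PS}. The order "2", "4", "3" is inferred from the example (the position of "3" as the last {SP} is an assumption, since the text ends with "etc."), and the function names reject and restore are hypothetical.

```python
def reject(statuses, fixed_order):
    """Leave only the next {SP} in fixed_order pending; mark other pending ones 'rejected'."""
    pending = [sp for sp in fixed_order if statuses[sp] == "pending"]
    for sp in pending[1:]:
        statuses[sp] = "rejected"
    return pending[0] if pending else None

def restore(statuses):
    """On return to the parent {PS}, reset 'rejected' {SP}'s to 'pending'."""
    for sp, status in statuses.items():
        if status == "rejected":
            statuses[sp] = "pending"

statuses = {"2": "pending", "3": "pending", "4": "pending"}
order = ["2", "4", "3"]  # order implied by the example; "3" last is assumed

first = reject(statuses, order)   # selects "2"; "3" and "4" become rejected
statuses["2"] = "achieved"        # {SP} "2" is solved and control returns to {PS} "1"
restore(statuses)                 # "3" and "4" become pending again
second = reject(statuses, order)  # selects "4"; "3" becomes rejected

print(first, second, statuses)
```

Resetting rejected {SP}’s on every return keeps the rejection decision local to a single visit of the parent {PS}, so the same hard-coded knowledge can be reapplied as the statuses change.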




The  primary  function  of  the  data  structures  described  above  is  the  computer 
implementation  of  a  systematic  methodology  for  mechanism  design.  The 
effectiveness  of  the  expert  system  depends,  to  a  large  extent,  on  the  manner  in  which 
explicit  knowledge  about  the  process  of  mechanism  design  can  be  built  into  the  data 
structure. 

Knowledge  Representation  Strategy 

The  corpus  of  knowledge  contained  within  the  expert  system  is  discretized  and 
partitioned  into  problem  spaces,  {PS}'s,  through  the  use  of  operators,  {OP}’s, 
possessing  certain  predefined  knowledge  roles.  These  operators  are  the  means 
through  which  a  planning  strategy  has  been  imparted  to  the  system.  The  operators 
and  their  associated  functions  have  been  created  as  the  means  through  which 
traversal,  from  one  problem  space  to  another,  can  systematically  and  consistently 
proceed  within  the  system.  Within  each  {PS}  the  operators  and  their  associated 
knowledge  roles  are  defined  as  follows  (Figure  5). 

Operator  Definitions: 

1.  Propose  operator: 

All {SP}'s applicable to the current problem-space, {PS}, have their status set to
"pending". The status of these {SP}'s is determined by the propose operator,
whose function is to check all the {PS}'s for potentially compatible {SP}'s.

2.  Evaluate  constraints  operator: 

This step applies only to {PS}'s that are probabilistic. Under this operator the
expert places knowledge that checks the user's input and the current status of the
design. It assigns the appropriate weighting factors, (wt), indicative of the
contribution that each of the constraints should make to the rooted {SP}'s.

3.  Assign  probability  of  success  operator: 

A  probability  of  success,  p(s),  is  assigned  to  each  of  the  {SP}’s  by  averaging  the 
weighting  factors  that  have  been  determined  under  the  evaluate  constraints 
operator. 

4.  Evaluate  state  operator: 

When the current step of the data structure is "evaluate state", the system will
attempt to match the current configuration of the data with condition elements
that, if present in working memory, would indicate failure or success of the
current {PS}. If such a match occurs then control, unless otherwise specified
by the failure handler, returns to the parent {PS}.

5.  Reject  operator: 

{SP}'s that have been proposed by the propose operator but which are forbidden
due to the presence of an appropriate piece of knowledge are rejected by this
operator. This operator is used in fixed {PS}'s to reject all but one {SP}.

[Figure 5: flowchart of problem-space i. A complex operator i selected in
problem-space i-1 creates problem-space i, which then passes through Step 1
(propose operators), Step 2 (evaluate constraints), Step 3 (determine acceptability
of proposed operators), Step 4 (evaluate state of problem-space), Step 5 (reject
operator), Step 6 (choose operator) and Step 7 (apply operator).]

Figure 5. Knowledge roles (operators) defined within the i-th problem-space
are partitioned into seven steps.


6.  Choose operator:

In probabilistic {PS}'s the {SP} with the highest probability of success (i.e., one
exceeding the cutoff value) is chosen. In the eliminate-joints {PS}, more than one
{SP} might have a weighting factor greater than the cutoff value, and thus more than
one {SP} can be selected (more than one joint type rejected). In other probabilistic
{PS}'s, such as "get Lind-min", only one {SP} must be selected. Therefore, after the
one {SP} with the highest weighting factor is selected, applied and achieved, the
evaluate-state operator should set the current {PS} to an achieved status, and
control will then return to the get-graph {PS}. This happens because the operator in
control when a {SP} is achieved is evaluate-state. In fixed {PS}'s, the single {SP}
with a pending status will be selected.


7.  Apply operator:

The chosen {SP} is applied by the apply operator. If it is complex then the {SP}
becomes the current {PS}; otherwise the predefined steps of the appropriate
simple {PS} are executed.

Figure 5 depicts the relationship of these operators within the i-th problem space.

Attributes that are descriptive of the structure and function of mechanisms, such
as the number and types of links and joints, degrees-of-freedom, number of
independent loops, etc., and knowledge that describes these quantities are represented
in the form of hierarchical frames [13]. Frames make it possible to readily
represent objects hierarchically and to simplify their communication control
structure. The following is a frame-based knowledge representation of an atlas of
graphs, written in the OPS5 language, corresponding to the structure of
mechanisms. These representations can change and expand as the system grows.

(ATLAS  ^graph-id <a>                  ; This is an atlas of graphs,
        ^degree-of-freedom <b>)        ; each of which has a unique
                                       ; id number <a> and a dof <b>
                                       ; associated with it.

(GRAPH  ^id <a>                        ; Each graph has structural
                                       ; variables associated with it.
        ^#-of-independent-loops <b>    ; Number of independent loops.
        ^ground-link <d>               ; Indicates which link is grounded.
        ^input-link <e>                ; Indicates which link is the input link.
        ^output-link <f>               ; Indicates which link is the output link.
        ^#-of-P-joints <p>             ; Number of prismatic joints.
        ^#-of-R-joints <r>             ; Number of revolute joints.
        ^#-of-G-joints <g>             ; Number of gear joints.
        ^status <n>)                   ; Can be pending, selected or rejected.

(ADJACENCY-MATRIX                      ; Each graph, <a>, has an adjacency
                                       ; matrix, <array-id>, associated with it.
        ^graph-id <a>
        ^matrix <array-id>)

(MECHANISM                             ; The values of a mechanism's attributes,
                                       ; given below, represent the current design
                                       ; status. These values are used to determine
                                       ; the graphs to be enumerated and
                                       ; evaluated.
        ^dof <a>                       ; F is specified by the user input to the
                                       ; questions, # of inputs and # of outputs.
        ^Lind <b>)                     ; This is defined by the knowledge found in
                                       ; "get current Lind" (Figure 2).

(JOINTS ^type <a>                      ; Any of type R, P or G.
        ^max-#1 <b>                    ; A problem-space must be created at
                                       ; the appropriate level which determines
                                       ; the maximum number of <a> joints to
                                       ; be used within any loop.
        ^max-#2 <c>                    ; Again, a {PS} must be created that will
                                       ; have knowledge of the maximum number
                                       ; of <a> joints to be used in the current
                                       ; mechanism design.
        ^status <d>                    ; Either rejected or pending, which is
                                       ; determined in the eliminate-type of
                                       ; joints problem space.
        ^wt <e>)                       ; The weighting value assigned to the {SP}
                                       ; in the eliminate-type of joints problem-
                                       ; space. To be used later (in case status is
                                       ; pending) to help evaluate the graphs that
                                       ; use joint <a>.

Note that letters in angle brackets represent the values of variables
associated  with  the  structure  of  mechanisms.  This  data  structure  stores  knowledge 
about  a  complex  element  (in  this  case  mechanism  structure)  in  a  hierarchical  format 
and  communicates  it  through  the  use  of  an  identification  (id)  number. 
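As a rough illustration of how such frames are linked through an id number, the ATLAS and GRAPH frames can be mirrored in Python. This is only a sketch: the class and field names are transliterations chosen here, and the lookup helper `graphs_with_dof` is hypothetical rather than part of MECXPERT.

```python
from dataclasses import dataclass


@dataclass
class AtlasEntry:
    graph_id: int            # unique id number of the graph
    degree_of_freedom: int   # dof associated with that graph


@dataclass
class Graph:
    id: int
    n_independent_loops: int
    ground_link: int
    input_link: int
    output_link: int
    n_p_joints: int          # prismatic joints
    n_r_joints: int          # revolute joints
    n_g_joints: int          # gear joints
    status: str = "pending"  # pending, selected or rejected


def graphs_with_dof(atlas, graphs, dof):
    """Return the graphs whose atlas entry carries the requested dof."""
    wanted = {entry.graph_id for entry in atlas if entry.degree_of_freedom == dof}
    return [g for g in graphs if g.id in wanted]


atlas = [AtlasEntry(1, 1), AtlasEntry(2, 2)]
graphs = [Graph(1, 1, 1, 2, 4, 0, 4, 0), Graph(2, 2, 1, 2, 5, 1, 6, 0)]
one_dof = graphs_with_dof(atlas, graphs, dof=1)
```

The two records never refer to each other directly; they are joined only through the shared id, which is the communication mechanism the text describes.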

In  addition  to  hierarchical  frames,  production  rules  serve  as  a  second  method  of 
representing  knowledge,  a  description  of  which  has  been  provided  under  the  section 
of  this  paper  entitled  Software  Implementation  Issues. 

Systematic  Planning  for  the  Control  of  Knowledge 


repetitious computations or enumerations of mechanism structures [14] for
efficiency. 

The  planning  strategy  incorporated  within  the  system  has  been  designed  to  address 
the  following  issues: 

How  is  the  next  problem-space  {PS}  chosen  when: 

1.  The  current  {PS}  has  to  be  broken  down  into  {SP}’s  in  order  to  be  solved. 


2.  The  current  {PS}  has  been  solved  successfully. 

3.  The  current  {PS}  cannot  be  solved  (i.e.,  is  assigned  a  failed  status). 


4.  A  solution  to  the  current  {PS}  is  not  acceptable  at  other  levels  of  the  knowledge 
representation  hierarchy. 



The  first  question  refers  to  the  concept  of  a  complex  {PS}.  If,  in  addition  to  being 
complex,  the  current  {PS}  happens  to  be  fixed  then  as  already  mentioned  there  must 
be  predefined  knowledge  resident  in  the  system  to  indicate  what  the  next  allowable 
{PS}  will  be  (i.e.,  a  predefined  course  of  action).  User  specified  input  is  assigned 
(1)  weighting  values  over  the  range  of  values  of  0  through  1  in  decimal  increments 
corresponding  to  their  relative  importance  and  (2)  "degree  of  compatibility"  values 
corresponding  to  how  compatible  they  are  considered  to  be  with  a  given  {PS}  or  {SP}. 

If  the  current  {PS}  happens  to  be  probabilistic,  in  addition  to  being  complex,  then 
selection  of  the  next  {PS}  will  depend  on  weighting  factors  associated  with  (carried 
by)  the  next  level  of  {SP}’s.  The  overall  probability  of  success  of  a  given  {SP} 
depends  on  (1)  the  values  of  weighting  factors  assigned  by  the  user  for  input  that  is 
compatible  with  the  {SP},  (2)  the  number  of  inputs  that  are  compatible  with  the 
{SP},  (3)  the  degree  of  compatibility  of  the  {SP}  with  the  current  {PS},  and  (4)  the 
probability  of  success,  p(s),  of  the  {SP}  in  the  current  {PS}  (the  last  two 
compatibilities  are  defined  by  a  domain  expert).  The  possibility  of  dependence  of  a 
{SP}  choice  on  the  successes  or  failures  of  previous  {SP}’s  and/or  {PS}’s  is  also 
taken  into  consideration.  It  can  be  seen  that  inputs  provided  by  the  user,  representing 
specifications  that  must  be  satisfied  up  to  a  desired  predefined  degree  of 
compatibility,  are  used  to  appropriately  constrain  or  trace  the  path  taken  by  the 
design  process. 

The following example demonstrates, in a simplified manner, how constraint
propagation has been implemented. Referring to Figure 3, {PS} "4" is defined to be
probabilistic. Inputs "a", "b", and "c" are defined to be compatible with {SP} "9", and
inputs "a", "b", and "d" are defined to be compatible with {SP} "10". Furthermore, it
is assumed that the domain expert has assigned the following degree-of-compatibility
values for inputs a, b and c with respect to {SP} "9":

doc_a/9 = .7,  doc_b/9 = .8,  doc_c/9 = .9

and for inputs a, b and d with respect to {SP} "10":

doc_a/10 = .8,  doc_b/10 = .7,  doc_d/10 = .9

It is also assumed that the user has entered the following weighting factors for the
inputs:

wt_a: 0.8,  wt_b: 0.6,  wt_c: 0.3,  and  wt_d: 0.9

Based  on  the  above  degree  of  compatibility  values  and  weighting  factor  values,  the 
weighting  factors  for  {SP}’s  "9”  and  "10"  will  be  respectively: 


{SP}  "9"  :  ((0.7  *  0.8)  +  (0.8  *  0.6)  +  (0.9  *  0.3))  /  3  =  .4367 

{SP}  "10":  ((0.8  *  0.8)  +  (0.7  *  0.6)  +  (0.9  *  0.9))  /  3  =  .6233 

Thus,  in  this  case  {SP}  "10"  would  have  been  selected  if  there  was  no  knowledge 
under  step  "8"  (Figure  3)  that  would  forbid  the  selection  of  {SP}  "10".  Note  also  that 
a  "cutoff  value"  has  been  established  for  each  probabilistic  {PS}.  In  order  for  a  {SP} 
to  be  selected  it  must  acquire  a  composite  weighting  factor  value  that  is  higher  than 
the  cutoff  value  assigned  to  the  current  {PS}. 
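The averaging used in the two expressions above, together with the cutoff test, can be sketched in Python as follows. The function names and the cutoff of 0.5 are assumptions for illustration; the degree-of-compatibility and weighting values are the ones appearing in the products above.

```python
def composite_weight(doc, wt):
    """Average of (degree of compatibility * user weight) over compatible inputs."""
    products = [doc[name] * wt[name] for name in doc]
    return sum(products) / len(products)


def choose(sp_weights, cutoff):
    """Return the {SP} with the highest composite weight if it exceeds the cutoff."""
    best = max(sp_weights, key=sp_weights.get)
    return best if sp_weights[best] > cutoff else None


wt = {"a": 0.8, "b": 0.6, "c": 0.3, "d": 0.9}        # user weighting factors
doc_9 = {"a": 0.7, "b": 0.8, "c": 0.9}               # expert values for {SP} "9"
doc_10 = {"a": 0.8, "b": 0.7, "d": 0.9}              # expert values for {SP} "10"

weights = {
    "9": composite_weight(doc_9, wt),
    "10": composite_weight(doc_10, wt),
}
selected = choose(weights, cutoff=0.5)               # hypothetical cutoff value
```

With these numbers {SP} "10" is the only candidate above the assumed cutoff, which agrees with the selection described in the text.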

Finally, if the {PS} selected is simple and if it has been achieved then control
returns to the parent {PS}. In the event of failure, if recovery is possible the failure
handler will take over; otherwise control is returned to the parent {PS} and the failed
{PS} will be assigned a "failed" status. As was previously discussed, in every {PS}
there is knowledge about whether the {PS} has been achieved, has failed or is pending,
embedded within internal {SP}'s. The failure handler will only take over when the
failure occurs at a simple {PS}, because only then is it possible to precisely
recommend a specific plan of action for recovery. The failure handler, when
activated, keeps track of the {PS} where the failure has occurred and of when its own
role is completed. In this way, the design process will be able to resume at the point
at which it stopped. The failure handler will go to the {PS} that is recommended
within the simple {SP} where the failure has occurred, and a "parallel" process will
take place until the design is re-established at a desired status. Control will then
return to the failed simple {PS}.

Defining  a  Mechanism  Design  Problem  Within  MECXPERT  (Problem  Definition  Phase) 


automatic  sketching  with  animation  and  kinematic  and  dynamic  analyses  (Figure  1). 
These  utilities  provide  feedback  to  the  system  and  designer  as  to  the  applicability  of 
the  heuristically  chosen  mechanism.  If  necessary  an  iterative  design  procedure  can 
be  instantiated  or  an  alternative  design  may  be  chosen  by  making  appropriate  changes 
to  the  specified  design  specifications  and  constraints  to  be  satisfied. 

The  system  must  first  acquire  knowledge  about  the  problem  through  a  knowledge 
acquisition  facility.  In  this  stage  of  operation  the  system  attempts  to  acquire  as 
much  information  as  possible  from  the  user  so  that  the  design  can  be  constrained  and 
the  domain  pruned.  The  only  required  information  is  the  number  of  inputs  and  the 
number  of  outputs.  However,  from  a  practical  standpoint  additional  information 
must  be  specified  in  order  to  narrow  the  number  of  alternative  or  candidate 
mechanisms  to  be  further  studied.  The  additional  information  is  acquired  by  the 
system  in  a  hierarchical  manner  so  that  only  relevant  questions  need  be  asked. 

Most questions require that a weighting factor be specified in the range of
zero to one, in decimal increments, corresponding to a certainty factor. If the term
"explain"  is  entered  at  any  time  during  a  user  session,  instead  of  the  required 
answer,  a  help  facility  will  provide  a  detailed  explanation  of  the  current  question. 

At  this  stage  the  system  will  attempt  to  narrow  down  the  mechanism  design 
search  space  (domain)  as  much  as  possible  by  identifying  design  goals  that  can  be 
approached  in  a  more  specific  way.  This  is  necessary  since  the  number  of  potential 
mechanisms  for  different  design  specifications  obtained  from  the  heuristic  rules 
employed,  for  a  general  design  case,  would  most  likely  be  unreasonably  large.  This 
inefficiency  is  a  result  of  the  lack  of  knowledge  about  how  the  different  specific 
domains  and  sub-domains  (represented  as  {PS}’s  in  Figure  2)  that  constitute  the 
general  design  domain  relate  to  general  concepts  and  of  course  computer-based 
limitations  (memory  and  speed). 

The following is a list of representative system queries requiring user input:

1.  Enter the # of inputs.

2.  Enter the # of outputs.

3.  Enter the type of mechanism.

4.  Enter the name of the function to be generated (ex. straight line motion).

5.  (Questions that will specify the task of each output):

    a.  Order of the path traced by each of the outputs.

    b.  Output link(s) must be connected to a prismatic joint (wt. 0->1).

    c.  Which outputs must be grounded (wt. 0->1).

6.  Which input(s) must be grounded (wt. 0->1).

7.  Which input(s) must be sliders (wt. 0->1).

8.  Will there be a control or guidance function within the mechanism (wt. 0->1).

9.  Enter the maximum number of links, lmax.

10. Enter the minimum number of independent loops, Lind,min.

11. Low cost design (wt. 0->1).

12. Reliable design (wt. 0->1).

13. Ease of manufacturability (wt. 0->1).

14. Speed of mechanism (wt. 0->1).

15. Load (wt. 0->1).


These  inputs  will  be  used  to  constrain  the  design  domain  by  means  of  the  constraint 
propagation  method  described  earlier. 

The  following  is  a  partial  listing  of  the  rules  that  will  be  used  by  the  "evaluate- 
constraints"  operator  within  the  "graph-evaluation"  problem  space  (Figure  4): 


Rule-1.  If  the  mechanism  is  a  path  generator  then  the  output  link  must  be  a  floating 
link. 


Rule-2.  If  the  mechanism  is  a  function  generator  then  the  output  link  must  be  in 
contact  with  ground. 

Rule-3.  If  there  are  more  than  two  slider  joints  in  any  single  loop  then  the  topology 
is  invalid. 


Rule-4.  If there is a need for a guidance or control loop then the output link should
not belong to the loop that contains the input. This implies the need for
Lind,min.

Rule-5.  The total number of independent loops cannot be less than (the required
number of links which are adjacent with the ground link) - 1.

Rule-6.  For the purpose of simplifying the analysis phase of mechanism design,
mechanisms containing at least one independent loop enclosed by {3 + (total
number of dof's)} links should be selected for evaluation prior to those
mechanisms which do not satisfy this rule.
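Several of these rules are simple predicates on a candidate topology, as the following Python sketch suggests. The function names and argument encodings are hypothetical, and "floating" is taken here to mean "not adjacent to the ground link"; the actual system expresses this knowledge as OPS5 production rules.

```python
def output_link_ok(mechanism_type, output_adjacent_to_ground):
    """Rule-1 and Rule-2: placement of the output link by mechanism type."""
    if mechanism_type == "path-generator":        # Rule-1: floating output link
        return not output_adjacent_to_ground
    if mechanism_type == "function-generator":    # Rule-2: output touches ground
        return output_adjacent_to_ground
    return True                                   # no restriction stated


def violates_rule_3(sliders_per_loop):
    """Rule-3: more than two slider joints in any single loop is invalid."""
    return any(count > 2 for count in sliders_per_loop)


def violates_rule_5(n_independent_loops, n_links_adjacent_to_ground):
    """Rule-5: loops may not be fewer than (links adjacent to ground) - 1."""
    return n_independent_loops < n_links_adjacent_to_ground - 1


candidate_is_valid = (
    output_link_ok("path-generator", output_adjacent_to_ground=False)
    and not violates_rule_3([1, 2])
    and not violates_rule_5(n_independent_loops=2, n_links_adjacent_to_ground=3)
)
```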


These rules will assist in the process of pinning down the kinematic structural
parameters during execution of the "get-graph" {PS}. Also, user input related to
load, speed, noise level, cost, reliability and manufacturability considerations is
used by the "eliminate-joint-types" {PS}, Figure 4, to reject or assign preferences
for the different available joint-types.

When  the  problem  definition  (data  input)  phase  has  been  completed  the  planning 
strategy  imbedded  within  the  MECXPERT  system  chooses  the  next  design  phase, 
either  type  synthesis  or  dimensional  synthesis. 


Type Synthesis Phase of Mechanism Design within MECXPERT


At  this  level  two  independent  problem-spaces,  {PS}’s,  are  available: 
1.  Modification  of  an  existing  design  (Iterative  Redesign): 


This  level  assumes  the  existence  of  a  known  mechanism  topology  in  order  to  fulfill 
the  user  specified  design  requirements.  An  iterative  redesign  procedure  is  initiated 
where  changes  can  be  made  to  structural  characteristics  (link  lengths)  of  a  known 
mechanism  in  order  to  move  the  existing  design  closer  to  the  required  design  in  an 
incremental  fashion.  After  each  change  is  made  to  the  mechanism,  animation  and,  if 
desired,  dynamic  analysis,  are  performed  in  order  to  assess  the  effect  of  the 
changes  on  nearing  the  desired  mechanism  functional  requirements. 

2.  Systematic  type  synthesis: 

In this problem space, {PS}, the system will first compute the number of links
and joints for the simplest possible mechanism, i.e., the one having the minimum
allowable value for Lind. This is because the goal is to satisfy the functional
requirements in the simplest way possible. It will then choose the appropriate
non-isomorphic graphs of kinematic chains from an atlas stored in the database.
Finally,  all  possible  combinations  for  the  ground  link,  the  inputs  and  outputs  and  the 
types  of  joints  will  be  systematically  enumerated  from  the  non-isomorphic  graphs. 
After  this,  as  shown  in  Figure  4,  the  next  step  will  be  "graph-evaluation".  Heuristics 
will  be  used  to  assign  a  weighting  factor  to  each  of  the  graphs.  Graphs  with 
weighting  factors  greater  than  0.5  (arbitrarily  chosen,  but  tuning  of  this  parameter 
may  be  required)  will  have  a  chance  to  continue  on  into  the  analysis  phase,  where  the 
graphs  will  be  examined  in  accordance  with  their  priority  as  indicated  by  the 
weighting  factors. 
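The filtering and prioritising of evaluated graphs can be sketched in a few lines of Python. The function name and the dictionary of weights are hypothetical; only the 0.5 threshold and the ordering by weighting factor come from the text.

```python
def graphs_for_analysis(weights, cutoff=0.5):
    """Keep graphs whose weighting factor exceeds the cutoff, best first."""
    kept = [(graph_id, w) for graph_id, w in weights.items() if w > cutoff]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [graph_id for graph_id, _ in kept]


# Hypothetical weighting factors produced by the graph-evaluation heuristics.
queue = graphs_for_analysis({"g1": 0.4, "g2": 0.9, "g3": 0.6})
```

Exposing the cutoff as a parameter reflects the remark that this value was chosen arbitrarily and may need tuning.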

Dimensional  Synthesis  Phase  of  Mechanism  Design  within  MECXPERT 


The dimensional synthesis phase of MECXPERT has been subdivided into two
problem-spaces, {PS}'s:

1.  Automatic Sketching:

The graph representation constrains the link connectivity in mechanism design.
However, a mechanism has to be uniquely defined not only by its link connectivity but
also by its physical dimensions. The technique which applies default link lengths and
orientations to the graph-to-mechanism conversion problem is usually referred to as
the automatic sketching of mechanisms. In addition to the default dimensions, link
lengths and orientations, default constraints associated with mechanism geometry
must also be specified. These include (1) arbitrarily assigning a joint position to be
coincident with the origin of the selected coordinate system and (2) arbitrarily
assigning a horizontal link (this is usually the ground link). In accordance with the
number of degrees-of-freedom possessed by the mechanism, additional constraints
must be specified, equal in number to the degrees-of-freedom. These additional
constraints are referred to as pseudo-constraints, and they determine the initial
position of the mechanism. In order to satisfy all the constraints for automatic
mechanism sketching, a Newton-Raphson iteration scheme has been adopted.


2.  Mechanism  Animation  and  Automatic  Kinematic  Analysis: 


The  concept  of  the  loop  closure  equation,  referred  to  as  the  Freudenstein 
equation,  can  be  expanded  for  solving  the  kinematics  of  multiple  loop  mechanisms. 
As  a  result,  by  applying  this  new  equation  solving  strategy,  a  computationally 
efficient  divide  and  conquer  algorithm  has  been  developed  to  generate  closed  form 
solutions for 97% of all planar eight-link mechanisms and 56% of all ten-link
mechanisms  requiring  only  seconds  of  cpu  time.  This  approach  can  greatly  expedite 
the  analysis  phase  of  mechanism  design.  The  remaining  cases  can  be  solved  using 
traditional  numerically-based  techniques  such  as  the  Newton-Raphson  method. 
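To make the numerical fallback concrete, the following Python sketch applies Newton-Raphson iteration to the loop closure equations of a single four-bar loop. It is an illustration under stated assumptions, not the MECXPERT routine: the function name, the link-length symbols r1 to r4 (ground, crank, coupler, follower) and the sample dimensions are all chosen here.

```python
import math


def solve_four_bar(r1, r2, r3, r4, theta2, theta3, theta4,
                   tol=1e-12, max_iter=50):
    """Solve r2*e(theta2) + r3*e(theta3) = r1 + r4*e(theta4) for theta3, theta4.

    theta3 and theta4 are initial guesses; theta2 is the known crank angle.
    """
    for _ in range(max_iter):
        # Residuals of the x and y components of the loop closure equation.
        f1 = (r2 * math.cos(theta2) + r3 * math.cos(theta3)
              - r4 * math.cos(theta4) - r1)
        f2 = (r2 * math.sin(theta2) + r3 * math.sin(theta3)
              - r4 * math.sin(theta4))
        if abs(f1) < tol and abs(f2) < tol:
            return theta3, theta4
        # Jacobian of (f1, f2) with respect to (theta3, theta4).
        j11, j12 = -r3 * math.sin(theta3), r4 * math.sin(theta4)
        j21, j22 = r3 * math.cos(theta3), -r4 * math.cos(theta4)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-14:
            raise ValueError("singular Jacobian; try another initial guess")
        # Newton step: solve J * d = -f by Cramer's rule.
        theta3 += (-f1 * j22 + f2 * j12) / det
        theta4 += (-f2 * j11 + f1 * j21) / det
    raise RuntimeError("Newton-Raphson did not converge")


# Sample linkage: joints at A(0,0), B(0,1), C(4,3), D(4,0).
t3, t4 = solve_four_bar(4.0, 1.0, math.sqrt(20.0), 3.0,
                        theta2=math.pi / 2, theta3=0.3, theta4=1.4)
```

As with any Newton-Raphson scheme, the branch that is found depends on the initial guess, which is one reason default dimensions and pseudo-constraints matter in the sketching step.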


Demonstrative Example of Planning Operations within MECXPERT


Aspects  of  the  MECXPERT  system,  related  primarily  to  knowledge 
representation  and  planning,  have  been  discussed  in  detail,  while  touching  briefly  on 
issues  related  to  heuristically-based  systematic  type  and  dimensional  synthesis  and 
data  input.  Two  {PS}’s  will  now  be  examined  within  the  context  of  a  specific 
problem  in  order  to  demonstrate  how  planning  is  actually  carried  out  in  the  system. 

Freudenstein and Maki [7] employed the method of separation of kinematic
structure and function to develop a variable-stroke slider crank mechanism for the
design of a new internal-combustion engine. The problem presented in their paper
will be used to demonstrate the sequence of planning operations which can occur
within the "Design" and "eliminate-type of joints" {PS}'s (Figure 6).

When the current {PS} becomes "Design", the system checks to determine whether
the current {PS} is fixed or probabilistic. This, along with other information, is
stored in the compatibility1 data structures that are created prior to the time the
system enters the "start" {PS} phase, i.e., when the system is initialized. Next, the
system checks the contents of the compatibility1 records (in the current {PS}, i.e.,
the "Design" {PS}, there are three of them: (1) problem definition, (2) problem
relaxation and (3) type synthesis) in order to propose compatible {SP}'s.

[Figure 6: flowchart of the sequence of events. When the program starts:
(1) compatibilities are established (problem space to problem space); (2) control
passes to the start problem space. Lisp routines: (1) enumerate graphs, (2) check
isomorphism. Boxed numbers indicate the sequence of events.]

Figure 6. Sequence of events for Variable-Stroke engine mechanism
design problem.

There are two types of compatibility elements, shown below. Compatibility1
indicates compatible {SP}'s within a {PS}, whether the {PS} is fixed or probabilistic
and whether the {SP} is simple or complex. Compatibility2 is only used in
probabilistic {PS}'s to indicate which constraints (associated with user input) are
compatible with which {SP}'s in the current {PS}, the weighting value of that
compatibility (assigned by rules based on the expert's knowledge and the user's
input) and a cutoff value that indicates the minimum composite weighting factor value
that a {SP} must have in order to be selected. Clearly, with only minor modification
to the compatibility records it is possible to restructure the entire tree or to add new
{PS}'s. This would have been a difficult task if a procedural language had been used
to implement the system.


(compatibility1  ^PS Design  ^SP type-synthesis
                 ^type complex  ^type1 fixed)

c  Comments:

c  Type-synthesis is a compatible subproblem-space of the problem-space "Design".
c  Type-synthesis is a complex subproblem-space.
c  Type-synthesis is a fixed subproblem-space.


(compatibility2  ^PS eliminate-type of joints  ^constraint speed
                 ^SP eliminate-p joints
                 ^wt <to-be-found-under-evaluate-constraints-operator>
                 ^cutoff <PS-dependent>)

c  Comments:

c  Speed is a constraint associated with the eliminate-type of joints
c  problem-space.

c  Eliminate-p joints is a compatible subproblem-space of the eliminate-
c  type of joints problem space.

c  A weighting factor value, <wt>, associated with the eliminate-p
c  joints problem-space determines if p-type joints should be eliminated
c  from the mechanism design.

c  A cutoff value, <cutoff>, indicates the minimum weighting factor value
c  that the eliminate-p joints problem space must have in order to be
c  selected (i.e., not rejected) for use in the design of the mechanism.


(compatibility1  ^PS eliminate-type of joints  ^SP eliminate-p joints
                 ^type simple  ^type1 probabilistic)

c  Comments:

c  Eliminate-p joints is a compatible subproblem-space of the problem-space
c  eliminate-type of joints.

c  Eliminate-p joints is a simple subproblem-space.
c  Eliminate-p joints is a probabilistic subproblem-space.
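The way compatibility1 records drive the propose step can be sketched in Python as a table lookup. The record layout, the hyphenated names and the `propose` helper are assumptions made for illustration; only the two compatibility1 records listed above are taken from the text.

```python
# Each record mirrors one compatibility1 element: parent {PS}, rooted {SP},
# whether the {SP} is simple or complex, and whether it is fixed or probabilistic.
COMPATIBILITY1 = [
    {"PS": "Design", "SP": "type-synthesis",
     "type": "complex", "type1": "fixed"},
    {"PS": "eliminate-type-of-joints", "SP": "eliminate-p-joints",
     "type": "simple", "type1": "probabilistic"},
]


def propose(current_ps, records):
    """Create every {SP} compatible with the current {PS}, with status pending."""
    return {rec["SP"]: "pending" for rec in records if rec["PS"] == current_ps}


proposed = propose("Design", COMPATIBILITY1)

# Restructuring the tree is a data change only: adding one record makes a new
# (hypothetical) subproblem-space available without touching propose().
extended = COMPATIBILITY1 + [
    {"PS": "Design", "SP": "example-new-sp", "type": "simple", "type1": "fixed"},
]
proposed_after = propose("Design", extended)
```

Because the tree lives in records rather than in control code, extending it requires no change to the operators themselves, which is the advantage the text attributes to the non-procedural implementation.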
The three compatible {SP}'s corresponding to the "Design" {PS}, as shown in
Figure  4,  will  then  be  created  and  set  to  a  "pending"  status.  Since  this  {PS}  is  "fixed" 
the  system  will  pass  over  steps  1  through  4  and  jump  to  step  5,  the  reject  operator 
(Figure  5),  and  will  look  for  knowledge  to  reject  all  but  a  single  {SP}.  One  of  the 
rules  that  performs  this  function  is  given  as  follows: 

(p reject-knowledge-in-design-1
    (goal ^step reject-operator ^PS Design)
    (SP ^PS Design ^name problem-definition
        ^status pending)
    (SP ^problem-space Design ^name {<> problem-definition}
        ^status pending)
    -->
    (modify 3 ^status rejected))

This  rule  states  that  when  the  current  step  is  step  5,  the  reject-operator,  in  the 
"Design"  {PS},  and  when  the  "problem-definition"  {SP}  is  "pending"  then  reject  all  the 
{SP}’s  other  than  the  "problem-definition"  {SP}.  After  this  step,  the  system  will 
execute  step  6,  the  choose  operator,  and  choose  the  only  available  {SP}.  Thus,  the 
"problem-definition"  {SP}  is  set  to  a  "selected"  status.  Next,  the  system  will  move 
on  to  step  7,  the  apply  operator,  and  apply  the  chosen  {SP}.  After  the  "problem 
definition"  {PS}  is  executed,  its  status  will  change  to  "achieved".  During  the 
"problem-definition"  phase  the  system  will  query  the  user  and  associate  his  answers 
with  a  data  structure  called  constraint  as  follows: 

(constraint  ^name speed  ^wt <a>  ^status active)
c  Comments: 

c  Speed  is  a  constraint  having  both  a  weighting  factor  value,  <a>,  and  a 
c  status  associated  with  it. 

After  this  {PS}  has  been  executed  different  constraints  will  acquire  an  active  status 
and  weighting  factor  values.  For  the  variable-stroke  engine  design,  typical  inputs 
would  be: 

1.  High  speed  (wt.  .9)  and  high  loads  (wt.  .9) 

2.  Low noisiness (wt. .8)

3.  One  input  and  output  (wt.  1.0) 

4.  Rotary  input  (wt.  1.0)  and  slider  output  (wt.  1.0) 

5.  Control  function  within  the  mechanism  (wt.  1.0) 


As soon as the system identifies an "achieved" {PS} it will back up to its parent
{PS}. Thus, control will return to the "Design" {PS}. There the control will be
assigned to step 4, the evaluate state operator (Figure 5), and the system will
determine whether the built-in evaluation knowledge for the current {PS} makes the
current problem "achieved" or "failed". The system will also reset all the previously
rejected {SP}'s to a status of pending. Since the successful completion of the
"problem-definition" {PS} does not ensure that the "Design" {PS} has been achieved,
knowledge in step 5, the reject operator, will decide which {SP} will be rejected.
The  system  at  the  current  status  will  reject  the  "problem-relaxation"  {SP}  since  it 
would  be  expected  to  select  the  "type-synthesis"  {SP}.  Once  the  "type-synthesis"  {SP} 
is  selected,  it  will  become  the  current  {PS}  and  the  procedure  shown  in  Figure  5 
will  be  carried  out  by  its  {SP}’s.  This  procedure  will  set  the  "eliminate-type  of 
joints"  {SP}  to  be  the  current  {PS}.  Up  to  this  point  the  status  of  the  executed  {PS}’s 
would  be  as  follows: 


Pending:  Start,  Design,  type-synthesis,  eliminate-type  of  joints. 
Rejected:  Tutor,  problem-relaxation  and  get-graph. 

Achieved:  Choose  phase,  Problem-definition. 


The  "eliminate-type  of  joints"  {PS}  is  probabilistic.  After  its  two  rooted  {SP}’s 
are  proposed,  the  operator  in  control  will  be  "evaluate-constraints".  Depending  on 
user  input,  knowledge  provided  by  a  mechanism  design  expert  will  compute  a 
weighting  factor  value  that  will  be  used  by  the  next  operator  to  compute  the  total 
degree of compatibility of each {SP}. Thus, the next operator will combine the expert’s
assessment with the weighting factors that the user assigned to those inputs that are
compatible with the current {SP} to compute the degree of compatibility of each of
the {SP}’s. In a more general implementation of this system additional {SP}’s would
exist  such  as  reject  cam  joints,  reject  spherical  joints,  etc.  For  the  design  of  a 
variable-stroke  engine  the  last  three  {SP}’s  would  have  the  highest  degrees-of- 
compatibility  since  for  high  speed  and  high  load  operating  conditions,  joints  having 
surface  contact  rather  than  line  contact  are  preferred  and  probably  required. 
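
The text does not state how the expert's assessment and the user's weighting factors are combined, so the following Python sketch is only one plausible reading: it assumes a weighted average over the constraints that both the user and the expert have rated, followed by a comparison with a cutoff value. The function names, the averaging rule and the numbers in the usage note are all hypothetical.

```python
def degree_of_compatibility(user_wts, expert_wts):
    """Assumed rule: user-weighted average of the expert's weights.

    user_wts:   {constraint name: weighting factor given by the user}
    expert_wts: {constraint name: expert weight of that constraint for one {SP}}
    Constraints that the expert did not rate for this {SP} are ignored.
    """
    shared = [name for name in user_wts if name in expert_wts]
    total = sum(user_wts[name] for name in shared)
    if total == 0:
        return 0.0
    return sum(user_wts[name] * expert_wts[name] for name in shared) / total


def select_above_cutoff(user_wts, expert_table, cutoff):
    """Keep only the {SP}'s whose degree of compatibility exceeds the cutoff."""
    scores = {sp: degree_of_compatibility(user_wts, wts)
              for sp, wts in expert_table.items()}
    return {sp: score for sp, score in scores.items() if score > cutoff}
```

For instance, with user weights of 0.9 for both high speed and high loads, an {SP} that the expert rates 0.9 and 1.0 on those constraints scores 0.95 under this assumed rule, while one rated 0.1 on both scores 0.1 and would fall below a cutoff of 0.5.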

Next, the system will check for the existence of any knowledge that would make
the rejection of a {SP} necessary. For example, in this design case, both the
"eliminate r-joints" and "eliminate p-joints" {SP}’s would be rejected. The
"eliminate-type of joints" {PS} would continue to select and execute unrejected {SP}’s in order to
make certain joint types available to lower level {SP}’s. When there are no longer
any {SP}’s having weighting factor values greater than the cutoff value, the
"eliminate-type of joints" {PS} will acquire a status of "achieved" and back up to the
"type-synthesis" {PS}. In the "type-synthesis" {PS}, the "get-graph" {SP} would then
be  selected.  The  current  status  check,  relative  to  the  last  status  check,  of  the 
executed  {PS}’s  would  be  as  follows: 

Pending:  Type  synthesis,  get-graph. 

Rejected:  eliminate  r-joints,  eliminate  p-joints. 

Achieved:  eliminate-type  of  joints. 

The  goal  of  the  "get-graph"  {PS}  is  to  assign  values  to  parameters  that  define  the 
kinematic  structure  of  the  mechanism.  These  parameters  are: 

1. F, the degree-of-freedom of the mechanism. This is determined by the number of
inputs required to drive the mechanism as well as the number of required
outputs, specific application requirements (i.e. how the mechanism is to be
used), the degree of complexity of the mechanism and whether or not the
mechanism is required to be adjustable. This information is acquired from the
user.

2. $L_{ind}$, the number of independent loops in the mechanism. This variable is indicative of
the degree of complexity of the mechanism. Its value is determined in the {PS}
"define-current-Lind". The system will define, based on heuristics, a minimum and
a maximum value for $L_{ind}$, starting from the minimum value since simplicity is a
desired property. In this design case, $L_{ind,min}$ must be $\ge 2$, based on Rule-4
of the "evaluate-graph" {PS}, since a control loop is required to vary the stroke of
the output link. The given design specifications require a value of $L_{ind} = 3$ in
order to provide separate input, control and output loops. The maximum value for
$L_{ind}$ could, in general, be determined, for example, from cost and compactness
limitations, as well as from input/output requirements.

3. $f_i$, the degree-of-freedom of relative motion permitted by the ith joint.

4. l, the number of links.

5. j, the number of joints.

6. X, the mobility of the space in which the mechanism operates. X = 3 for general
plane mechanisms, X = 6 for spatial mechanisms.


The  general  degree-of-freedom  equation  may  be  expressed  as  [5]: 

$$F = X\,(l - j - 1) + \sum_{i=1}^{j} f_i \qquad (1)$$


The  number  of  independent  loops  is  given  by  the  equation  [15]: 


$$L_{ind} = 1 + j - l \qquad (2)$$

Equations (1) and (2) can be combined into the following equation:


$$\sum_{i=1}^{j} f_i = F + X\,L_{ind} \qquad (3)$$


Based on equation (3) with $L_{ind} = 3$ (as previously discussed), F = 1, and X = 3
(for a general plane mechanism) the sum of the degrees-of-freedom for all the
joints can be calculated:

$$\sum_{i=1}^{j} f_i = 1 + 3 \cdot 3 = 10 \qquad (4)$$


Since high load carrying capability was a specified design requirement, only revolute
(R) and prismatic (P) joints, each having one degree-of-freedom ($f_i = 1$), can be
included in the design. Based on this information, equation (4) yields a value of j =
10. Rearranging equation (2), the number of links can be calculated as follows:

$$l = 1 + j - L_{ind} = 8 \qquad (5)$$


In general, equations (2), (3), (4) and (5) can be used to determine values for j
and l, depending upon the values selected for the $L_{ind}$, F and X structural
parameters. Their values would be chosen, firstly, to achieve the simplest possible
design, based on heuristic knowledge appropriate to their selection. Once values for
l and j are known, appropriate graphs can be enumerated (labeling the graphs in as
many non-isomorphic ways as possible) and different joint types can be assigned to
the edges of the graph in a way that ensures the satisfaction of equation (3). This
has been implemented in a LISP routine (Figure 6). The next step involves the
evaluation of the graphs in the "evaluate-graphs" {PS} and the assignment of an index
to each of them indicative of the order in which they should be processed, i.e.
studied in greater detail. Additional generic (problem independent) rules can be
established to assist in the elimination of inappropriate kinematic structures, thereby
further pruning the size of the mechanism design space.
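
The arithmetic of equations (1) through (5) can be checked with a few lines of code. The sketch below is in Python rather than the LISP of the original routine and is not that routine; the function name and the simplifying assumption that every joint has the same degree-of-freedom are illustrative.

```python
def structural_parameters(F, X, L_ind, f_per_joint=1):
    """Derive the joint and link counts from F, X and L_ind.

    Assumes every joint permits the same relative degree-of-freedom
    (f_per_joint), as in the example where only R and P joints are allowed.
    """
    sum_f = F + X * L_ind                  # equation (3)
    if sum_f % f_per_joint != 0:
        raise ValueError("sum of joint freedoms is not a multiple of f_per_joint")
    joints = sum_f // f_per_joint          # j, since each f_i = f_per_joint
    links = 1 + joints - L_ind             # equation (2) rearranged, as in (5)
    # Consistency check against the general equation (1).
    assert F == X * (links - joints - 1) + joints * f_per_joint
    return {"sum_f": sum_f, "joints": joints, "links": links}
```

With F = 1, X = 3 and L_ind = 3 this gives a joint freedom sum of 10, ten joints and eight links, which matches equations (4) and (5).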

As  an  example  of  the  output  provided  by  the  system  to  the  user  on  the  Symbolics 
3640  AI  workstation,  figures  7A,  7B,  7C  and  7D  display  an  enumeration  of  the 
graphs  and  mechanism  schematic  diagrams  of  several  eight-link  planar  kinematic 
chains  corresponding  to  numbers  1,  2  and  3  in  group  1  and  number  9  in  group  3  of 
those  enumerated  by  Freudenstein  and  Maki  [7]  for  the  variable-stroke  engine 
mechanism  problem. 
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Figure  7A.  :  Graph  enumeration  and  schematic  drawing  for  an  eight-link  planar 
variable-stroke  mechanism  (group  1,  number  1;  Freudenstein  and 
Maki,  [7]). 


Figure  7B.  :  Graph  enumeration  and  schematic  drawing  for  an  eight-link  planar 
variable-stroke  mechanism  (group  1,  number  2;  Freudenstein  and 
Maki  [7]). 
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Figure  7C.  :  Graph  enumeration  and  schematic  drawing  for  an  eight-link  planar 
variable-stroke mechanism (group 1, number 3; Freudenstein and
Maki [7]).


Figure 7D. : Graph enumeration and schematic drawing for an eight-link planar
variable-stroke mechanism (group 3, number 9; Freudenstein and
Maki [7]).
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Eventually,  kinematic  and  dynamic  analyses  would  be  undertaken  for  performance 
evaluation  at  a  lower  level  of  the  design  process. 

Conclusions 

A  systematic  methodology  for  representing  knowledge  and  its  control  within  an 
expert  system  for  the  creative  design  of  mechanisms  has  been  presented.  Careful 
attention  to  the  implementation  of  the  control  strategy  for  the 
manipulation  of  knowledge  has  been  an  important  aspect  of  this  research  in 
anticipation  of  future  growth  of  the  MECXPERT  system.  The  conceptual  basis  for  the 
system  relies  on  the  separation  of  kinematic  structure  and  function.  An  example 
based  on  the  design  of  a  variable-stroke  engine  mechanism  serves  to  convey  the 
manner  in  which  information  is  imparted  to  and  manipulated  within  the  system  in 
an  effort  to  enumerate  potentially  viable  mechanism  designs. 
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Compatibility  data  structure: 

Defines  all  the  compatible  {PS}’s  and  {SP}’s  each  time  the  system  is  initialized 
(started). 

Constraint  propagation: 

The  process  of  establishing  the  compatibility  of  interconnected  elements,  in  this 
case  {PS}’s  and  {SP}’s,  within  an  expert  system. 

Creative  mechanism  design: 


The  process  of  solving  a  mechanism  synthesis  problem  for  which  no  prior,  proven 
solution exists, based on the systematic separation of kinematic structure from
function  and  employing  heuristics  where  applicable  for  the  selection  of  kinematic 
structural  parameters  in  order  to  narrow  the  mechanism  design  search  space. 

Cutoff  value: 


In  order  for  a  probabilistic  {PS}  to  be  selected  it  must  acquire  a  composite 
weighting  factor  value  that  is  higher  than  the  preset,  expert  defined,  cutoff  value 
assigned  for  that  {PS}. 

Data-driven  inference  strategy: 

The  search  for  new  knowledge  or  information  proceeds  from  known  data  to  a  final 
goal. 

Deep  domain  knowledge: 


Domain  specific  knowledge  acquired  over  years  of  experience  enabling  an  expert  to 
solve  difficult  problems  in  that  domain  that  cannot  be  solved  by  only  analytical  or 
numerical  methods. 


Experience-based mechanism design:

The process of drawing upon knowledge concerning the structure-function
relationships of mechanisms obtained from (1) mechanism design experts and (2)
handbooks.


Failure  handler: 


Keeps track of a {PS} where failure has occurred. When activated, the failure handler
will go to the {PS} recommended by the {SP} where failure has occurred and initiate
a process parallel to the one in which failure occurred until the design is reset to a
desired status.


Goal-driven  inference  strategy: 

The  search  for  new  knowledge  or  information  proceeds  from  the  goal  to  be  achieved, 
backwards,  towards  the  known  data. 

Help  facility: 

A  facility  provided  within  MECXPERT  which  provides  tutoring  and  advice  to  a  user 
concerning  the  meaning  and  use  of  system  commands.  It  can  be  initiated 
through  the  user  specified  system  command  word  "explain". 

Inference  mechanism: 


An  interpreter  that  determines  how  to  apply  the  rules  in  the  knowledge  base  to 
infer  new  knowledge  and  the  order  in  which  these  rules  should  be  applied  in  an 
expert  system. 


Instance: 

A variable whose value has been specified, i.e. instantiated.

Knowledge acquisition:


The process of acquiring knowledge about a specific area or domain (in this case
mechanism design), from various sources, in order to bring this knowledge to bear
on a narrow domain of difficult problems.


Knowledge  base: 

The  collection  of  knowledge,  typically  in  the  form  of  facts  and  rules,  about  a 
specific  domain  (in  this  case  mechanism  design)  to  be  used  for  decision  making  in 
an  expert  system. 

Knowledge  roles: 

Knowledge  and  the  action  which  it  can  impart  are  stored  in  data  elements  referred 
to  as  knowledge  roles. 

Mechanism  synthesis: 

The  process  of  selecting  the  type,  arrangement  and  number  of  links  and  joints  in  a 
mechanism  for  the  purpose  of  fulfilling  predetermined  motion  conversion  or  power 
transmission  requirements. 

Operators: 

Data elements whose function permits the representation and control of knowledge.

Problem-space:

Represents  the  issue  or  concept  currently  under  consideration.  These  are  the  states 
that  the  system  can  reside  in  and  pass  through  in  its  effort  to  achieve  its  goal. 

Problem  space  status: 

The  status  of  a  problem  space,  {PS},  can  take  on  one  of  three  possible  values:  (1) 
Pending,  (2)  Achieved  and  (3)  Failed.  These  are  described  in  the  text  of  the  paper. 

Routine  design: 

A  design  problem  for  which  a  proven  solution  methodology  already  exists  and  for 
which  the  design  variables  are  known. 

Redesign: 

The  process  of  changing  an  existing  design,  based  on  proven  techniques,  in  order  to 
comply  with  different  design  requirements. 

Subproblem-space: 

A subproblem-space, {SP}, represents the next available problem space.

Weighting  factor: 

Indicates the degree to which each of the constraints contributes to each rooted
{SP}. The weighting factors are denoted as $wt_{ij}$, the degree to which the ith
constraint contributes to the jth {SP}.

Working memory element:


A  data  element  that  resides  in  the  working  memory  portion  of  program  memory 
during  program  execution. 
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Abstract 

In  order  to  overcome  the  problem  of  lack  of  generality  in  nonlinear 
programming  (NLP)  test  problem  formulation  and  to  introduce  the  concept  of 
cognitive  NLP  method  switching,  statistical  machine  learning  has  been  applied  to  a 
sample  data  base  of  nonlinear  programming  problems.  Reasonable  conclusions  have 
been  drawn  about  an  optimization  problem  type  and  a  corresponding  sequence  of  NLP 
solution  algorithms,  using  statistical  pattern  recognition  applied  to  local  (vs.  global) 
design  information.  A  program,  referred  to  as  OPTDEX-OLDM,  with  the  capability 
of  learning  from  statistical  pattern  recognition  is  discussed.  The  statistical  aspects 
and  algorithmic  optimization  of  the  nonlinear  programming  problem  are  emphasized 
in  this  discussion.  A  clustering  process  has  been  performed  on  attributes  assigned 
to  the  NLP  problem  sample  data  base,  and  an  example  which  describes  this 
statistical  clustering  process  is  discussed. 

Introduction 

Numerical  optimization  techniques,  in  the  form  of  nonlinear  programming  (NLP) 
algorithms,  have  been  applied  extensively  to  critical  structural  design  and  analysis 
problems for more than 30 years [1], and to a lesser extent to mechanical design
problems  [2]. 

The  nonlinear  programming  problem  considered  here  takes  the  form, 

$$\begin{array}{lll}
\text{Minimize:} & F(X) & \text{objective function} \\
\text{Subject to:} & g_j(X) \le 0, \quad j = 1, m & \text{inequality constraints} \\
 & h_k(X) = 0, \quad k = 1, l & \text{equality constraints} \\
 & X_i^{l} \le X_i \le X_i^{u}, \quad i = 1, n & \text{side constraints} \\
\text{where} & X = [X_1, X_2, \ldots, X_n]^T & \text{design variables.}
\end{array}$$

Numerical  optimization  provides  a  systematic,  rational  and  directed  approach  to 
design  decision  making  where  previously,  heavy  reliance  was  placed  on  the 
experience  and  intuition  of  the  designer  in  achieving  an  improved  design.  Due  to 
complexities  involved  in  the  implementation  of  NLP  algorithms,  several  researchers 
have undertaken performance analyses [3,4,5], the purpose being to determine
correlations among the design problem type, the numerical optimization method and
the corresponding results. Based on such studies, it is anticipated that the novice
user should be able to better understand the capabilities of existing optimization
methods and, furthermore, utilize them without the need to undertake exhaustive
programs for testing and learning. While in concept this appears to be a rational
approach to ascertain the capabilities of a particular algorithm for a specific
problem, in reality, Himmelblau [6] states that "a guarantee of convergence for an
algorithm for special cases may offer little insight as regards satisfactory
strategies for more complex problems".

An  optimization  process  invariably  involves  a  trade-off  between  reality 
(completing  and  understanding  the  search  process)  and  economy  (evaluating  a  limited 
number of test functions). A process referred to as statistical concept learning¹ is
introduced to compensate for this trade-off. Based on a well-organized data
hierarchy, concept learning has been developed to eliminate unwanted knowledge
which may occur due to noisy data² [7] and a scheme for generalization of the
statistical  results  has  been  developed. 

Method  Switching  Strategies  in  Nonlinear  Optimization 

Existing  algorithms  for  nonlinear  programming  which  have  been  surveyed  [8,9] 
may  converge  to  local  optima  which  are  not  necessarily  global  optima.  Many 
techniques for locating global optima, aside from knowing which method is the best
first method, have yet to be uncovered. Method switching strategies are based, by
analogy, on the game of golf³ rather than on the use of a one-step optimization
scheme.  This  method  switching  procedure  is  designed  to  be  one  level  higher  than  the 
so  called  optimization  strategy  level  [10]  (monitors  the  numerical  optimization 

1  Statistical  concept  learning:  Learning  about  new  concepts  by  using  given  statistical 
measurements. 

2  Noisy  data:  A  small  amount  of  data  contradicting  the  conclusions  which  are  agreed 
upon  by  a  majority  of  the  remaining  data.  In  other  words,  data  lying  outside  any  of 
the  defined  cluster  groups  (Figure  1.). 

3  Game  of  golf  analogy:  The  reason  for  method  switching  is  in  accordance  with  the 
local  geographical  design  information  at  the  numerical  optimum,  and  is  analogous  to 
the reason for selecting an appropriate golf club, in the game of golf, to strike the
ball.

[Figure 1 axis label: Frequency of ignition timing error]


Figure  1.  A  clustering  example. 

1.  Group  #1  (low  rpm)  and  Group  #3  (high  rpm) 
cause  more  ignition  failures. 

2.  Those  points  which  are  not  enclosed  within 
any  of  the  groups  have  been  referred  to  as 
noisy  data. 
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process)  and  switches  suitable  numerical  method  combinations  according  to  local 
design  data. 

For  example,  in  the  following  problem  containing  a  single  objective  function, 
design  variable  and  constraint: 

Minimize the design objective function: $\cos(x/1000 - 5)$

Subject to the design constraint function: $400 - x^2 \le 0$

with design variable bounds: $0 \le x \le 7500$

The  following  cases  may  possibly  occur, 


case 1: When $|x| \gg 20$, the local information indicates that the design constraint is
inactive.

case 2: When $x \approx 5000$, the local information indicates that the objective function
can be approximated by a polynomial of degree 2, which is $1 - (x/1000 - 5)^2/2$.

case 3: When $|x - 5000| < 10$, the local design information indicates that the
objective function can be approximated by a polynomial of degree 4 by using a
truncated Taylor series expansion.

case 4: When $x \ge 7500$, one more design constraint is added from the design
bounds, which can be expressed as $x \le 7500$.

This  example  demonstrates  that  local  design  information  can  change  in  various 
ways  when  the  updated  state  of  the  design  variables  (position)  is  altered.  Method 
switching  strategies  are  based  on  this  phenomenon  and  may  be  likened  to  a  monitoring 
or blackboard⁴ style decision making process. Method switching keeps track of the
local  optimization  information  and  switches  methods  when  the  current  method  fails. 
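
The way local information changes with position in this one-variable example can be illustrated numerically. The following Python sketch is an illustration only; the function names, the tolerance and the returned dictionary layout are assumptions and are not part of OPTDEX-OLDM.

```python
import math


def objective(x):
    return math.cos(x / 1000.0 - 5.0)


def constraint(x):
    """Design constraint 400 - x^2 <= 0; satisfied when the value is <= 0."""
    return 400.0 - x * x


def quadratic_model(x):
    u = x / 1000.0 - 5.0
    return 1.0 - u * u / 2.0             # degree-2 Taylor polynomial about x = 5000


def quartic_model(x):
    u = x / 1000.0 - 5.0
    return 1.0 - u * u / 2.0 + u ** 4 / 24.0   # degree-4 Taylor polynomial


def local_information(x, upper=7500.0, tol=1e-9):
    """Collect the kind of local facts a method switching manager might monitor."""
    g = constraint(x)
    return {
        "constraint_satisfied": g <= 0.0,
        "constraint_active": abs(g) <= tol,   # on the constraint boundary
        "at_upper_bound": x >= upper,          # the side constraint takes effect
        "quadratic_error": abs(objective(x) - quadratic_model(x)),
        "quartic_error": abs(objective(x) - quartic_model(x)),
    }
```

Evaluating `local_information` at points such as x = 20, x = 5005, x = 6000 and x = 7500 shows the constraint boundary, the region where the quadratic model is already accurate, the region where the quartic model is noticeably better, and the point where the upper bound takes effect.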

According  to  the  schematic  representation  depicted  in  Figure  2,  the  first  design 
starting point, P1, lies in an infeasible design region and is far away from the
globally  optimal  point.  A  temporary  goal  may  be  expressed  as  "move  the  design  into 
the  feasible  region  as  soon  as  possible"  to  increase  the  design  efficiency.  When  the 
design  "converges"  at  a  local  optimum,  P2,  current  NLP  methods  fail  to  move  away 
from  this  point.  In  accordance  with  the  local  information  found  in  the  vicinity  of 
P2,  the  method  switching  manager  pins  down  another  temporary  goal  which  may  be 
stated  as  "find  a  feasible  design  with  a  smaller  objective  value".  Method  switching 
terminates  when  the  convergence  criteria  have  been  satisfied.  This  is  usually  based 
on (1) a CPU time consumption limitation, (2) the number of algorithm iterations or
(3)  relative  or  absolute  difference  between  successive  values  of  the  objective 
function. 

4 Blackboard architecture: A model in which all intermediate messages and results
are displayed to the user and stored in a common area, called a blackboard.

[Figure 2 legend:
design constraints;
design side constraints (design bounds);
global optimum.]

The closed curves are isoclines of the design objective
function.

P1, P2, P3 are intermediate starting points for searching.


Sample  Problem  Testing 

Fifteen  different  attributes  have  been  chosen  to  characterize  the  test  of  a  sample 
problem.  The  sample  problems  can  be  separated  into  three  domains: 

1.  The  Design  Problem  Type  -  contains  8  parameters,  including  the  number  of  design 
variables,  the  number  of  total  design  constraints,  the  number  of  equality  design 
constraints,  the  number  of  active  inequality  design  constraints,  the  maximum 
(positive)  order  of  testing  polynomials,  the  minimum  (negative)  order  of  testing 
polynomials  and  the  function  evaluation  cost  for  one  design  function  evaluation. 

2.  The  Choice  of  Nonlinear  Programming  method  -  contains  3  parameters,  which 
according  to  the  ADS  numerical  optimization  library  [10]  are  strategy,  optimizer 
and  one  dimensional  search  method. 

3.  The  Performance  of  the  Result  -  contains  4  parameters,  including  the  minimum 
objective  value  reachability,  the  design  constraint  violation  condition  and  the 
maximum  distance  of  search. 

The set of test problems for the learning program has been produced by a random
function  generator  (Figure  3),  which  randomly  selects  a  problem  type,  and  in 
accordance  with  the  selected  problem  type  generates  the  objective  function  and  the 
design  constraint  equations.  These  polynomials  can  be  thought  of  as  local  information 
in  real  world  design  problem  formulations  since  many  functions  can  be  expressed  in  a 
Taylor  series  expansion.  Nonlinearity,  discontinuity  and  differentiability  can  be 
altered  by  appropriately  adjusting  the  order  of  the  polynomials. 

After  implementing  these  concepts  using  the  ADS  numerical  optimization  library, 
design  problems  have  been  tested  by  a  number  of  method  combinations,  which  have 
been  randomly  selected.  The  authors  have  generated  approximately  10,000  samples 
with  results  using  an  IBM  PC/AT  microcomputer.  These  results  have  been 
subsequently  analyzed,  using  statistical  machine  learning  concepts  incorporated 
within a program referred to as OPTDEX-OLDM (Optimum Design Expert-
Optimization Level Design Manager), on a Symbolics 3640 AI workstation.


Clustering  and  Associated  Statistics 

Every  sample  inherently  has  several  attributes,  which  include  the  characteristics 
of  the  design  problem  type,  the  category  of  the  nonlinear  programming  method  and  the 
corresponding  result.  All  of  these  attributes  are  represented  quantitatively  and  some 
of  them  are  noisy,  i.e.  unreliable.  To  minimize  the  noise  factor,  a  "variance"  type  of 
analysis  [4]  has  been  employed. 

Clustering techniques [13,14,15,16] are used to find groups of samples whose
common characteristics have not been predefined. The aim is to subdivide the
available samples into a relatively small number of groups, based on the statistical
behavior of the different attributes.

[Figure 3 flowchart: the random function generator randomly generates (1) the
number of design variables, (2) the number of design constraints, (3) the
characteristics of the design constraints, based on a polynomial representation
with (a) maximum order (+) and (b) minimum order (-), and (4) the design
objective, using the same procedures as in (3). The samples then pass to the
machine learning stage.]

Figure 3. Flow control of random sample generation and testing.

The  clustering  analysis  involves  the  following  concepts: 

1. Scaling: Transforms the real world value of each attribute into a machine
understandable scale. This can be done by calculating the mean, $\mu$, where

$$\mu = \frac{1}{m} \sum_{i=1}^{m} a_i \qquad (2.a)$$

and the standard deviation, $\sigma$, where

$$\sigma = \sqrt{E\left((A - \mu)^2\right)} \qquad (2.b)$$

where m = total number of data samples.

$a_i$ = value of the ith attribute.

A = random variable which can assume the value $a_i$.

E = expected value (statistical sense).

Various models may be chosen to represent the statistical distribution of the
attributes. For example, if a Gaussian distribution is chosen, then about 68% of the
samples will be distributed within one standard deviation about the mean, $\mu$, and
about 95% of the samples will be distributed within two standard deviations about
the mean, $\mu$. According to the mean, $\mu$, and the standard deviation, $\sigma$, found for
each variable, all the variables are normalized and digitized to a predefined
scale. For the purposes of this research, 0 through 9 has been selected.

2. Non-hierarchical clustering: Non-hierarchical clustering is based on the
optimization of a given grouping of objective functions, and represents the
minimization of the sum of the variances within each group and the
maximization of the sum of variances between groups,

$$\min_{C \in p(n,M)} \sum_{j=1}^{n} \sum_{i \in C_j} \| a_i - \bar{a}_j \|^2 \qquad \text{and} \qquad \max_{C \in p(n,M)} \sum_{j=1}^{n} m_j \, \| \bar{a}_j - \bar{a} \|^2 \qquad (3)$$

where $C = (C_1, C_2, C_3, \ldots, C_n)$ and $C_j$ represents the jth cluster group.

$M = \{1, 2, 3, \ldots, m\}$; set of all samples.

$p(n,M)$ = set of all cluster groups C of M having length n.

n = number of cluster groups; $1 \le n \le m$.

$\bar{a}$ = the expected value of the total sample of attributes.

$\bar{a}_j$ = the expected value of $C_j$.

$m_j$ = the number of samples in $C_j$.

Since the total scatter in a fixed sample size is constant [10], it is sufficient to
minimize the sum of variances, W(n,M), within each group. Therefore, eq. (3) can
be expressed as follows.

$$\min_{C \in p(n,M)} W(n,M) = \min_{C \in p(n,M)} \sum_{j=1}^{n} \sum_{i \in C_j} \| a_i - \bar{a}_j \|^2 \qquad (4)$$

The  necessary  tools  for  the  clustering  process  are  described  below. 

To  calculate  the  new  mean  value  from  two  given  groups: 

$$\bar{a}_{p+q} = \frac{1}{m_p + m_q} \, (m_p \bar{a}_p + m_q \bar{a}_q) \qquad (5)$$

and to calculate the objective value (sum of variances) of the two given groups:

$$W_{p+q} = W_p + W_q + m_p \, [\bar{a}_p - \bar{a}_{p+q}] \, [\bar{a}_p - \bar{a}_{p+q}]^T + m_q \, [\bar{a}_q - \bar{a}_{p+q}] \, [\bar{a}_q - \bar{a}_{p+q}]^T \qquad (6)$$

3. Clustering strategy: Since the number of all possible cluster group combinations
(total clustering) can become prohibitively large, it is imperative that a
reduction in the number of clusters be attempted. For example, say m samples
(attribute values) have to be clustered into less than or equal to n groups. This
number of clusters is given by:


H  (n,m)  =  i  (Hi' 

n'  i=t 


For m = 1000 and n = 15, H(n,m) is greater than 1*10^18. For this research,
m = 10,000 and $150 \le n \le 300$; therefore the clustering is not practically
achievable.  As  a  result,  a  special  strategy  has  been  employed  to  alleviate  this 
problem.  Instead  of  searching  for  total  clustering,  the  OPTDEX-OLDM  program 
starts from m samples and allows each single sample to be a group, i.e. n = m.
The  program  then  attempts  to  decrease  the  total  number  of  cluster  groups,  during 
each  clustering  cycle,  by  one.  During  each  cycle  the  program  searches  for  any  two 
groups  from  the  current  set  which  satisfies  the  criterion  of  equation  (4).  This 
clustering  process  terminates  when  the  number  of  groups,  denoted  by  n*,  satisfies 
the  following  condition, 

$$\min W(n^*, M) \ge W_{acceptable} \qquad (8)$$

Based on this clustering strategy, H1(n,m), the reduced number of clusters, is:

$$H1(n,m) \approx \frac{m^2 - n^2}{2} \qquad (9)$$

For the m = 1000 and n = 15 case, H1(n,m) ≈ 5E+05. Although noise may bias this
type of clustering in the very early stages of processing, as previously predicted,
when compared with the increased efficiency, of approximately 2*10^12 times, it is
an acceptable strategy. Flow control for this process is shown in Figure 4.
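
The scaling step and the group-merging strategy described above can be sketched together as follows. This is a minimal Python illustration, not the OPTDEX-OLDM implementation: the mapping of z-scores onto the 0 to 9 scale (clipping at two standard deviations) is an assumption, the stopping rule follows one reading of condition (8), and all function and field names are hypothetical.

```python
import math


def digitize(values, levels=10, span=2.0):
    """Scale raw attribute values to integer levels 0..levels-1 (assumed mapping)."""
    m = len(values)
    mu = sum(values) / m                                        # equation (2.a)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / m)   # equation (2.b)
    if sigma == 0.0:
        return [(levels - 1) // 2] * m
    out = []
    for v in values:
        z = max(-span, min(span, (v - mu) / sigma))   # clip to +/- span sigma
        out.append(int((z + span) / (2.0 * span) * (levels - 1) + 0.5))
    return out


def merge_mean(m_p, a_p, m_q, a_q):
    """Mean of the merged group, equation (5)."""
    return [(m_p * x + m_q * y) / (m_p + m_q) for x, y in zip(a_p, a_q)]


def sq_dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))


def merge_cost(g_p, g_q):
    """Increase in within-group scatter caused by a merge, from equation (6)."""
    a_pq = merge_mean(g_p["m"], g_p["mean"], g_q["m"], g_q["mean"])
    return (g_p["m"] * sq_dist(g_p["mean"], a_pq)
            + g_q["m"] * sq_dist(g_q["mean"], a_pq))


def merge(g_p, g_q):
    return {
        "m": g_p["m"] + g_q["m"],
        "mean": merge_mean(g_p["m"], g_p["mean"], g_q["m"], g_q["mean"]),
        "W": g_p["W"] + g_q["W"] + merge_cost(g_p, g_q),   # equation (6)
        "members": g_p["members"] + g_q["members"],
    }


def greedy_cluster(samples, w_acceptable):
    """Start with one group per sample and merge the cheapest pair each cycle."""
    groups = [{"m": 1, "mean": list(s), "W": 0.0, "members": [i]}
              for i, s in enumerate(samples)]
    # Stop once the total within-group scatter reaches the acceptable level.
    while len(groups) > 1 and sum(g["W"] for g in groups) < w_acceptable:
        _, p, q = min((merge_cost(groups[p], groups[q]), p, q)
                      for p in range(len(groups))
                      for q in range(p + 1, len(groups)))
        merged = merge(groups[p], groups[q])
        groups = [g for k, g in enumerate(groups) if k not in (p, q)] + [merged]
    return groups
```

Because equations (5) and (6) update the mean and the scatter from group summaries alone, each merge avoids revisiting the individual samples, which is what makes the one-merge-per-cycle strategy affordable.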

Explanation  of  The  Statistical  Results 

An  explanation  facility  [17]  is  an  important  feature  which  distinguishes  artificial 
intelligence programs from conventional programs. Its purpose is to present the
computational results in natural language so that they are comprehensible to
a novice user. In addition, this capability forms the basis of incremental machine
learning.  A  simple  example  that  demonstrates  how  machine  learning  provides  an 
explanation  for  a  resulting  cluster  group  follows: 

Group 1. Number of members = 17

Attribute             Range    Mean    Variance
Nonlinearity          0-9      8       3.0
Strategy              0-9      2       0.2
Distance-of-Search    0-9      1       1.0

Response  from  the  OLDM: 

OLDM> I found that (as supported by 17 samples),

IF
Generally-speaking, the nonlinearity is very-high, and
Definitely, the strategy is the linear extended interior penalty
function method.

Then
Most-likely, optimization searching will be very-local.

(Underlined  explanations  represent  terminology  derived  from  the  statistical  results). 
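
The way statistical results are turned into such terminology can be illustrated with a small sketch. The cut-off values below are assumptions chosen only so that the Group 1 table reproduces the wording of the OLDM response; the actual thresholds used by the OLDM are not given in the text, and the function names are hypothetical.

```python
def level_word(mean, low=0, high=9):
    """Map a group mean on the attribute range to a qualitative level (assumed cut-offs)."""
    frac = (mean - low) / (high - low)
    if frac >= 0.8:
        return "very-high"
    if frac >= 0.6:
        return "high"
    if frac >= 0.4:
        return "medium"
    if frac >= 0.2:
        return "low"
    return "very-low"

def certainty_word(variance):
    """Map a group variance to a hedging phrase (assumed cut-offs)."""
    if variance <= 0.5:
        return "Definitely"
    if variance <= 1.5:
        return "Most-likely"
    return "Generally-speaking"

# Nonlinearity row of Group 1: mean 8, variance 3.0
phrase = f"{certainty_word(3.0)}, the nonlinearity is {level_word(8)}"
```

A small variance yields a confident phrase and a large variance a cautious one, so the wording carries the spread of the cluster as well as its mean.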

Classification and Incremental Machine Learning

Automatic concept learning, implemented in the form of concept learning
generalization^5, has been shown to be useful in interpreting and organizing large
amounts of information about a domain [?]. After performing the initial clustering
from the test samples (ten thousand samples in this case) the OPTDEX-OLDM
program reaches approximately 500 conclusions. These conclusions may overlap one
another, some of them may be redundant, and they all have to be appropriately
formatted into a rule-based expert system.

^5 Concept learning generalization: The automatic generalization of a concept based on
a sufficiently large number of agreements among specific case (non-general)
concepts. In other words, expanding a concept to include a more general class of
specific cases than previously included.

Creating  a  classification  scheme  is  typically  the  first  step  in  developing  the 
heuristics  (rules  of  thumb)  for  a  collection  of  observations  or  phenomena.  The  goal 
of  the  classification  scheme  is  to  structure  given  observations  into  a  hierarchy  of 
meaningful categories [6]. The OLDM applies generalization-based memory to build up
a  hierarchy  of  conclusions.  It  actually  constructs  a  connective  network  to  derive 
conclusions  in  a  canonical  form.  A  detailed  explanation  of  this  process  is  provided 
by  Lebowitz  [7].  An  important  feature  of  the  OLDM  is  its  ability  to  manage 
contradictions  between  conclusions,  referred  to  as  noise,  by  simply  counting  the 
number  of  supporting  members  for  each  conclusion.  For  example,  the  following 
conclusions  (non-generalized)  have  been  drawn  by  the  OLDM: 

Conclusion 1. Supported by 19 members

If the Discontinuity is high and
the Optimizer-choice is Golden-section-method
then the objective value is less-minimized.

Conclusion 2. Supported by 25 members

If the Discontinuity is low and
the Optimizer-choice is Golden-section-method
then the objective value is less-minimized.

Conclusion 3. Supported by 4 members

If the Discontinuity is high and
the Optimizer-choice is Golden-section-method
then the objective value is minimized.


The  generalized  concept,  drawn  by  the  OLDM,  based  on  these  conclusions  is: 

OLDM> CONCEPT-008:

If the Discontinuity is high or low+ and
the Optimizer-choice is Golden-section-method
then the objective value is less-minimized.++

comment: +<this result is based on the generalization of conclusions 1 & 2>
comment: ++<the number of members supporting conclusion 1 is greater than
the number supporting conclusion 3>
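
The two steps behind such a concept (resolving contradictions by support count, then merging conclusions that differ in a single attribute) can be sketched as follows. The data layout and the function name `generalize` are assumptions made for illustration, as is the choice to add the supports of merged conclusions; the text does not describe the internal representation used by the OLDM.

```python
from collections import defaultdict

def generalize(conclusions, attribute):
    """conclusions: list of (conditions dict, outcome, number of supporting members)."""
    # Step 1: for identical conditions keep only the best-supported outcome.
    best = {}
    for conditions, outcome, support in conclusions:
        key = tuple(sorted(conditions.items()))
        if key not in best or support > best[key][1]:
            best[key] = (outcome, support)
    # Step 2: merge surviving conclusions that agree on everything except `attribute`.
    merged = defaultdict(lambda: [set(), 0])
    for key, (outcome, support) in best.items():
        rest = tuple(item for item in key if item[0] != attribute)
        entry = merged[(rest, outcome)]
        entry[0].add(dict(key)[attribute])
        entry[1] += support
    return [(dict(rest), sorted(values), outcome, support)
            for (rest, outcome), (values, support) in merged.items()]

golden = {"Optimizer-choice": "Golden-section-method"}
concepts = generalize(
    [({"Discontinuity": "high", **golden}, "less-minimized", 19),
     ({"Discontinuity": "low", **golden}, "less-minimized", 25),
     ({"Discontinuity": "high", **golden}, "minimized", 4)],
    attribute="Discontinuity",
)
```

Applied to conclusions 1 to 3, the sketch discards conclusion 3 (4 members against 19) and merges conclusions 1 and 2 into a single rule valid for high or low Discontinuity.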

Another important feature of the OLDM is its ability to perform on-line
statistically incremental machine learning. The OLDM is an on-line consultant during
numerical optimization processing which has been incorporated within the ADS
(Automated Design Synthesis) optimization library. According to the existing rules
and local information from updated optimization searching, it chooses and switches
method combinations from ADS and feeds back the result of each applied rule. This
feedback is always represented in a standardized format with 14 parameters, as
previously described. Each piece of standardized information can be treated as an
additional test sample, a_e, clustered into a group, C_j, which satisfies the following
condition.


$\min_{C_j \in p(n, M)} \dfrac{1}{m_j + 1} \sum_{t=1}^{m_j} [a_t - a_e][a_t - a_e]$   (8)


During the incremental machine learning process, any of the existing cluster groups,
say Ck, such that W > Wacceptable has to be re-clustered by utilizing the
procedures which have been discussed. After the re-clustering process has been
completed, new concepts (conclusions) are born and/or old concepts die. This is
referred to as the birth-and-death procedure for maintaining and renewing concepts
in the knowledge base.
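
A minimal sketch of this incremental step is given below. It assumes that a new sample joins the group with the smallest averaged squared distance to its members and that a group is flagged for re-clustering when its scatter exceeds the acceptable value; the distance measure, the helper names and the sample values are illustrative assumptions rather than the OLDM implementation.

```python
def assign(sample, groups):
    """Add `sample` to the closest group and return that group's index."""
    def averaged_sq_distance(group):
        total = sum(sum((a - b) ** 2 for a, b in zip(member, sample))
                    for member in group)
        return total / (len(group) + 1)  # group size after the sample joins
    j = min(range(len(groups)), key=lambda k: averaged_sq_distance(groups[k]))
    groups[j].append(sample)
    return j

def needs_reclustering(group, w_acceptable):
    """True when the within-group scatter W exceeds the acceptable value."""
    dim = len(group[0])
    mean = [sum(p[d] for p in group) / len(group) for d in range(dim)]
    w = sum((p[d] - mean[d]) ** 2 for p in group for d in range(dim))
    return w > w_acceptable

groups = [[(0.0,), (0.2,)], [(5.0,)]]
index = assign((0.1,), groups)
```

Groups flagged by `needs_reclustering` would then be passed back through the merging procedure, which is where concepts are born or die.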


Conclusion 

A  new  approach  to  design  optimization,  referred  to  as  cognitive  method  switching, 
using  nonlinear  programming  (NLP)  algorithms  applied  sequentially,  based  on  local 
design  information,  has  been  presented.  Statistical  evaluation  with  clustering  of 
attributes  associated  with  a  randomly  generated  problem  sample  data  base,  containing 
over  10,000  samples,  has  led  to  the  generation  of  guidelines  for  the  application  of 
NLP  algorithms  to  design  optimization  problems.  Continued  expansion  of  the  problem 
data  base  should  permit  more  generalized  guidelines  to  be  obtained  and  thereby  assist 
the  nonexpert  user  in  cognitively  selecting  an  appropriate  sequence  of  NLP  algorithms 
for  a  specific  design  optimization  problem. 


TOWARD A NONEQUILIBRIUM THERMODYNAMICS OF TWO PHASE
MATERIALS WITH SHARP INTERFACE


ABSTRACT. This is a review of recent work of the author toward the
development  of  a  nonequilibrium  thermodynamics  of  two-phase  continua  based  on 
the  first  two  laws  in  forms  which  contain  interfacial  contributions  for  energy 
and  entropy.  Topics  discussed  are:  thermodynamic  restrictions  on  constitutive 
equations;  interface  conditions;  free-boundary  problems  for  solidification  and 
melting. 

1. INTRODUCTION. The classical theory of Stefan, for the melting of a
solid or the freezing of a liquid, is too simplistic to account for the myriad
of phenomena which occur during solidification (an example being dendritic
growth, in which simple shapes evolve to complicated tree-like structures).*
Recent attempts to rectify this situation involve replacing the classical
free-boundary condition

$\theta(x,t) = \theta_M$ on $a(t)$,   (1.1)

for the temperature $\theta(x,t)$ on the interface $a(t)$, by a condition in which
the mean curvature $H(x,t)$ and the normal velocity $V(x,t)$ of $a(t)$ are
allowed to influence the temperature:

$\theta(x,t) = \theta_M - hH(x,t) - bV(x,t)$.   (1.2)^2

Here $\theta_M$, a constant, is the transition temperature, the temperature at which
the bulk free energies of the solid and liquid coincide, while $h$ and $b$ are
constants.

The relation (1.2) with $b = 0$ is usually derived by assuming that (at
each time) the interface is in thermal equilibrium with the bulk material, and
then linearizing the interfacial condition obtained as a consequence of Gibbs'
criterion for stability. The complete relation (1.2) with $b \ne 0$ is
generally justified on an ad hoc basis, since the presence of the normal
velocity $V$ precludes the use of equilibrium thermodynamics.
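
To make the role of the two correction terms in (1.2) concrete, the following sketch evaluates the interface temperature for a spherical interface at rest. The numerical values are hypothetical, and the convention that the mean curvature of a sphere of radius R equals 2/R (sum of principal curvatures) is an assumption; with the other common convention the value would be 1/R.

```python
def interface_temperature(theta_m, h, b, mean_curvature, normal_velocity):
    """Relation (1.2): theta = theta_M - h*H - b*V."""
    return theta_m - h * mean_curvature - b * normal_velocity

def sphere_curvature(radius):
    # Assumed convention: H is the sum of the two principal curvatures.
    return 2.0 / radius

# Hypothetical dimensionless values, chosen only to show the size dependence.
theta_small = interface_temperature(1.0, 0.01, 0.0, sphere_curvature(0.1), 0.0)
theta_large = interface_temperature(1.0, 0.01, 0.0, sphere_curvature(10.0), 0.0)
```

For positive h the small sphere has the lower interface temperature, and the classical condition (1.1) is recovered as the radius grows and the interface comes to rest.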


*Cf., e.g., Chalmers [1] and Delves [2] for discussions of these phenomena.

^2 For solidification problems, free-boundary conditions of this type, with
$b = 0$, were introduced by Mullins and Sekerka [3], [4]; the term involving $V$
was added by Voronkov [5], Seidensticker [6], and Tarshis and Tiller [7].
(See also the review articles by Sekerka [8], [9], [10], Chernov [11], Delves
[2], and Langer [12].)

One expects that free-boundary conditions derived in this manner are a
valid approximation in many situations. On the other hand, since the
underlying physical problem involves a physical system out of - although
possibly near to - equilibrium, it would seem advantageous to develop a
nonequilibrium thermodynamics which yields, as consequences, appropriate
free-boundary conditions for the interface between phases. This review
discusses recent work [13]^3 by the author toward the development of such a
nonequilibrium thermodynamics.

2. BASIC CONCEPTS. The work [13] begins with the first two laws in
forms which are appropriate to a continuum and which contain interfacial
contributions for energy and entropy; but to avoid inessential complications,
attention is restricted to nondeformable bodies in the absence of diffusion.

A fairly general constitutive theory for the interface is considered.
The free energy $f$ and entropy $s$ are allowed to depend on the temperature
$\theta$ and on the orientation of the interface through a dependence on its unit
normal $\mathbf{m}$:

$f = f(\theta, \mathbf{m})$, $s = s(\theta, \mathbf{m})$.   (2.1)

(The dependence on $\mathbf{m}$ is included to model crystal growth.)

An essential requirement of the theory is that the temperature depend on
the kinematics of the interface. In particular, a constitutive relation

$\theta = \theta(V, \mathbf{m}, L)$   (2.2)

giving the temperature as a function of the normal velocity $V$ of the
interface, the curvature tensor $L$ for the interface, and the normal $\mathbf{m}$ is
introduced.

One might expect that the motion of the interface (relative to the
underlying material structure) induces a transfer of mechanical energy within
the interface. To allow for this possibility, a tangential vector field $\mathbf{j}$
is introduced; for $c$ an arbitrary subsurface of the interface $a$,

$-\int_{\partial c} \mathbf{j} \cdot \boldsymbol{\nu}$   (2.3)

represents a flow of energy into $c$ across $\partial c$. Here $\boldsymbol{\nu}$ (a tangential
vector field) is the outward unit normal to the boundary curve $\partial c$. The
vector field $\mathbf{j}$ is called the accretive energy flux, and the description of
the interface is completed by adding a constitutive equation

$\mathbf{j} = \mathbf{j}(V, \mathbf{m}, L)$;   (2.4)

interestingly, for an isotropic interface this flux vanishes identically.

Triplets $(V, \mathbf{m}, L)$ in the common domain of $\theta$ and $\mathbf{j}$ are called states,
and states with $V = 0$, $L = 0$ are called equilibrium states.

It is not clear what thermodynamic restrictions ought to be placed on
these constitutive assumptions, and it would seem appropriate to use the
second law - in the manner of Coleman and Noll [15] - to derive such
restrictions.^4 This is not as straightforward as it seems. For a rigid heat
conductor the treatment of Coleman and Noll [15], developed for single-phase
materials, is based on the hypothesis that the second law be satisfied in all
processes generated - through the constitutive equations - by smooth
temperature fields. Here, however, there is an additional degree of freedom,
the evolution of the interface, and the constitutive restriction (2.2) does
not allow for an arbitrary assignment of both the interface and the underlying
temperature.

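3. THERMODYNAMIC RESTRICTIONS ON CONSTITUTIVE EQUATIONS. Compatibility
with the second law leads to the following constitutive restrictions:

(i) the free energy has the form

$\hat{f}(\theta, \mathbf{m}) = \hat{f}_0(\mathbf{m}) + \hat{f}_1(\theta)$;

(ii) the entropy $\hat{s}(\theta, \mathbf{m}) = \hat{s}(\theta)$ is independent of $\mathbf{m}$ and determined by the
free energy through the entropy relation

$\hat{s}(\theta) = -\partial_\theta \hat{f}_1(\theta)$;

(iii) the accretive energy flux $\hat{\mathbf{j}}(V, \mathbf{m}, L) = \hat{\mathbf{j}}(V, \mathbf{m})$ is independent of $L$
and linear in $V$:

$\hat{\mathbf{j}}(V, \mathbf{m}) = -V \hat{\mathbf{f}}(\mathbf{m})$;   (3.1)

(iv) $\hat{\mathbf{f}}(\mathbf{m})$ is determined by the free energy through the stress relation

$\hat{\mathbf{f}}(\mathbf{m}) = -\partial_{\mathbf{m}} \hat{f}_0(\mathbf{m})$;

(v) given any state $(V, \mathbf{m}, L)$,

$V\{[\Psi(\theta)] - Hf - \partial_{\mathbf{m}} \hat{\mathbf{f}}(\mathbf{m}) \cdot L\} \ge 0$,   (3.2)

where $[\Psi(\theta)]$ is the jump in bulk free-energy across the interface.


^4 Murdoch [16] has applied this procedure to interfaces which do not move
relative to the underlying material.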


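Note that (3.1) reduces the energy flow (2.3) to

$\int_{\partial c} V \hat{\mathbf{f}}(\mathbf{m}) \cdot \boldsymbol{\nu}$.

This integral has an obvious interpretation as power expended on $c$, with
$\hat{\mathbf{f}}(\mathbf{m}) \cdot \boldsymbol{\nu}$ a force in the direction normal to the interface.^5 For that reason
$\hat{\mathbf{f}}(\mathbf{m})$ is called the accretive stress.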

One might object to the constitutive equations (2.1), (2.2), and (2.4),
as they are not consistent with the principle of equipresence. Consider
instead the system

$f = f(V, \mathbf{m}, L)$, $s = s(V, \mathbf{m}, L)$,
$\theta = \theta(V, \mathbf{m}, L)$, $\mathbf{j} = \mathbf{j}(V, \mathbf{m}, L)$.   (3.3)

Near equilibrium this system is no more general than the original system
(2.1), (2.2), and (2.4). Precisely, it is shown that, if (3.3) is compatible
with thermodynamics, then there exist a neighborhood of equilibrium $N$ and
constitutive functions $\hat{f}(\theta, \mathbf{m})$ and $\hat{s}(\theta, \mathbf{m})$ such that

$f(V, \mathbf{m}, L) = \hat{f}(\theta(V, \mathbf{m}, L), \mathbf{m})$, $s(V, \mathbf{m}, L) = \hat{s}(\theta(V, \mathbf{m}, L), \mathbf{m})$

on $N$.


In classical theories of melting - in which the interface is devoid of
structure - changes of phase occur at the transition temperature $\theta_M$. Within
the present theory a consequence of the inequality (3.2) is that the interface
have temperature $\theta_M$ at equilibrium,

$\theta(V, \mathbf{m}, L) = \theta_M$ whenever $V = 0$, $L = 0$,

but that away from equilibrium this need not be so; in fact,

$\theta = \theta_M - f(\mathbf{m})H - b(\mathbf{m})V + \mathrm{div}_a \mathbf{f}(\mathbf{m})$

is the linear approximation to (2.2) near equilibrium. Here $f(\mathbf{m}) = \hat{f}(\theta_M, \mathbf{m})$
is the interfacial free energy at equilibrium, $\mathbf{f}(\mathbf{m}) = \hat{\mathbf{f}}(\mathbf{m})$ is the accretive
stress, $\mathrm{div}_a$ is the surface divergence, $b(\mathbf{m})$ is an orientation-dependent
constant, and we have chosen a scaling in which the latent heat $\ell$ satisfies :


^5 Within a purely statical theory such a force was introduced by Cahn and
Hoffman [17], [18], whose work pointed out the need for a term of this form in
the energy equation when the interface is anisotropic. The vector $\hat{\mathbf{f}}(\mathbf{m})$ is
actually the tangential part of the vector used by Cahn and Hoffman.


4. FREE-BOUNDARY PROBLEMS. Approximate interface conditions are derived for
a weak interface, that is, one in which the interfacial densities are small
and the dependence on $V$ and $L$ weak. These interface conditions, when
combined with the usual quasi-static heat equation in bulk, lead to the
following system of partial differential equations and free-boundary
conditions for the temperature difference $u = \theta - \theta_M$:

$\mathrm{div}\, \mathbf{q} = 0$, $\mathbf{q} = -K_i \nabla u$ in $B_i$,

$u = -f(\mathbf{m})H - b(\mathbf{m})V + \mathrm{div}_a \mathbf{f}(\mathbf{m})$, $[\mathbf{q}] \cdot \mathbf{m} = \ell V$ on $a$.   (4.1)

Here $a = a(t)$ is the interface; $K_i$ is the conductivity tensor for phase
$i$; $B_i$ is the region of space occupied by phase $i$; $\mathbf{q}$ is the bulk heat
flux; $[\mathbf{q}]$ is the jump in $\mathbf{q}$ across the interface.

Global growth conditions are found for the system (4.1). To state these
succinctly, consider a bounded solid $B(t)$ in an infinite liquid melt, and
write

$F(a) = \int_a f(\mathbf{m})$

for the total interfacial free-energy computed using the equilibrium values of
the corresponding density. Then:

$\mathrm{vol}(B)^{\cdot} = 0$, $F(a)^{\cdot} \le 0$   (4.2)

provided the liquid is thermally isolated at infinity, while

$F(a)^{\cdot} + u_\infty \mathrm{vol}(B)^{\cdot} \le 0$   (4.3)

whenever the liquid is isothermal at infinity. Here $u_\infty$ is the (constant)
far-field temperature-difference.

The results (4.2) and (4.3) motivate two variational problems:

(V1) minimize $F(a)$ subject to $\mathrm{vol}(B) = \text{constant}$;

(V2) minimize $F(a) + u_\infty \mathrm{vol}(B)$.

The problem (V1) and the problem (V2) with $u_\infty > 0$ are well posed. On the
other hand, (V2) with $u_\infty < 0$ has no solution, as all minimizing sequences
have $\mathrm{vol}(B) \to \infty$. This is as expected: $u_\infty < 0$ corresponds to a solid in a
supercooled liquid melt, and $\mathrm{vol}(B) \to \infty$ indicates the ultimate envelopment
of the liquid by the more stable solid phase.
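
The absence of a minimizer for (V2) in a supercooled melt can be seen in the simplest special case. The sketch below assumes an isotropic interface (constant free-energy density f) and restricts attention to spherical solids of radius R, so that the functional becomes 4*pi*R^2*f + u_inf*(4/3)*pi*R^3; both restrictions, and the numerical values, are assumptions made for illustration.

```python
import math

def functional(radius, f, u_inf):
    """F(a) + u_inf * vol(B) for a sphere with constant interfacial energy f."""
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return f * area + u_inf * volume

def stationary_radius(f, u_inf):
    """Radius at which the derivative 8*pi*R*f + 4*pi*R^2*u_inf vanishes (u_inf < 0)."""
    return -2.0 * f / u_inf

f, u_inf = 1.0, -0.5          # hypothetical values, supercooled melt
r_star = stationary_radius(f, u_inf)
values = [functional(r, f, u_inf) for r in (r_star, 10 * r_star, 100 * r_star)]
```

For u_inf < 0 the stationary radius is a maximum rather than a minimum, and the functional decreases without bound as the sphere grows, in line with the statement that minimizing sequences have unbounded volume.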


ABSTRACT

The  shear-lag  model  is  applied  to  a  monolayer,  unidirectional,  fiber-reinforced 
composite  loaded  in  tension.  The  monolayer  contains  an  infinite  number  of 
parallel  fibers,  with  an  arbitrary  number  of  them  broken  simultaneously.  While 
the  fibers  are  modelled  as  linearly  elastic,  a  linear  viscoelastic  constitutive 
law  is  assumed  for  the  matrix  material.  The  time  evolution  of  the  overstress 
profiles  in  the  fibers  and  matrix  near  breaks  is  determined.  The  time 
dependence  of  the  effective  load  transfer  length  is  also  calculated.  Explicit 
evaluations  of  the  above  quantities  are  given  for  a  power-law  creep  compliance 
model,  suitable  for  most  epoxy  thermosetting  resins  as  matrix  materials. 


INTRODUCTION 

The  shear-lag  model  for  a  unidirectional  composite  was  developed  by 
HEDGEPETH  (1961)  as  an  attempt  to  describe  the  stress  fields  around  broken 
fibers.  It  is  a  simplified  micromechanics  model  for  which  closed  form 
solutions  can  be  obtained.  In  Hedgepeth’s  analysis  the  fibers  are  parallel, 
equally  spaced  and  of  infinite  length.  The  monolayer  includes  an  infinite 
number  of  fibers  with  a  cluster  of  them  broken  (see  Fig.  1)  and  is  loaded  by 
uniformly  distributed  tensile  tractions  in  the  direction  of  the  fibers.  Both 
fiber  and  matrix  materials  are  assumed  to  be  linearly  elastic.  The  drastic 
simplification  introduced  by  the  shear-lag  model  is  the  decoupling  between  the 
mechanisms  that  respond  to  shear  and  normal  stresses  in  the  composite.  It  is 
thus  assumed  that  the  fibers  alone  bear  the  normal  stresses  along  the  fiber 
direction,  while  the  matrix  material  acts  only  as  a  shear  transfer  mechanism 
that  overloads  the  adjacent  fibers  in  tension  whenever  a  fiber  breaks. 

The method of influence coefficients was used for the solution of the
above problem and the explicit evaluation of the overload coefficients of the
intact fibers due to fiber breaks was given by HEDGEPETH (1961). Closed-form
solutions in terms of Bessel and Weber functions for the overload and
displacement fields of the fibers were reported by FICHTER (1969, 1970), who
also looked into the problem of more than one group of breaks. A later work
by HEDGEPETH and VAN DYKE (1967) incorporates an elastic-perfectly plastic
model for the matrix material. In a subsequent work VAN DYKE and HEDGEPETH
(1969) assumed that the matrix fails completely when a maximum shear stress is
reached. A modified version of Hedgepeth's shear-lag analysis was undertaken
by ERINGEN and KIM (1974), who took into account the normal stresses in the
matrix transversely to the direction of the fibers. Along the same lines was
the analysis of GOREE and GROSS (1979) with the additional inclusion of
longitudinal yielding and splitting of the matrix and later on an extension to
the 3-D case (GOREE and GROSS, 1980). Comparisons of the predictions of the
shear-lag model with 3-D finite element calculations were done by REEDY (1984).
He found excellent agreement between the two methods for the fiber stress
concentrations in a Kevlar/epoxy monolayer for load levels that do not cause
matrix yielding.

In the present work we analyze the time response predicted by the
shear-lag model of a unidirectional, monolayer composite with an infinite
number of parallel fibers loaded in tension in the direction of the fibers, by
assuming a time-dependent constitutive model for the matrix material. We take
the matrix to be linearly viscoelastic, and as a special case we investigate
the consequences of a power-law, time-dependent creep compliance on the time
evolution of the overstress profiles around broken fibers. Such a power-law
creep compliance is commonly used to model the time response of epoxy
thermosetting resins, which are often used as matrix material for non-metallic
composites (POMEROY, 1978). A linear viscoelastic model for the matrix has
previously been used by LIFSHITZ and ROTEM (1970) in their statistical theory
of failure for composites, where Schapery's approximate technique was used to
obtain the time-dependent solution of a shear-lag model that lumped all broken
fibers into a single fiber.

In  the  first  section  the  formulation  of  the  shear-lag  problem  is  presented 
for  a  unidirectional  composite  under  tension  with  broken  fibers  and  a  linearly 
viscoelastic  matrix.  Also  described  is  the  method  of  solution  which  uses 
Laplace  transforms  and  finite  cosine  transforms.  In  the  second  section  a 
power-law  creep  compliance  is  assumed  for  the  matrix,  and  explicit  evaluations 
of  the  overloads  in  the  adjacent  intact  fibers,  the  shear  stresses  in  the 
matrix  and  the  effective  load  transfer  length  are  carried  out. 


1.  FORMULATION  OF  THE  SHEAR-LAG  PROBLEM 

The  model  of  a  thin,  unidirectional  laminate  is  shown  in  Fig.  1,  where  all 
fibers  are  identical  and  parallel  to  the  X  axis  and  have  an  equal  center-line 
spacing  H.  The  laminate  is  considered  to  be  a  two-dimensional  infinite  region 
with  an  infinite  number  of  fibers,  out  of  which  (2N+1)  neighboring  fibers  are 
broken along the Y axis at time T = 0. We are interested in calculating the
subsequent  stress  fields  near  the  breaks  in  the  fibers  and  the  matrix. 

Both  the  X  and  Y  axes  are  axes  of  symmetry  for  the  laminate  in  terms  of 
geometry  and  loading.  The  external  loading  is  uniform  tension  applied  in  the 
direction  of  the  fibers,  which  are  taken  to  be  the  only  tensile  load  carriers. 
This  is  a  justifiable  assumption  for  most  non-metallic  composites  because  the 
Young's  modulus  of  the  matrix  is  usually  one  or  more  orders  of  magnitude  less 
than  the  axial  Young’s  modulus  of  the  fibers. 

The  thickness  of  the  laminate  B  and  the  fiber  spacing  H  are  of  the  same 
order  as  the  diameter  of  the  fibers  D,  which  is  small  compared  to  the  length  of 
the fibers L. If we take as a reference length unit the fiber diameter D, then
L → ∞. The width of the laminate becomes infinite in this length scale as
well, as it consists of a large number of fibers. The infinite laminate model
is therefore a good approximation to the real configuration of the composite,
at least before extensive breaking of the fibers has taken place. If the
clusters of breaks are not sufficiently far away from each other, their
interactions should be taken into account. However, in the linear theory the
superposition principle can be applied, and the problem reduces again to the
infinite domain problem with only one group of breaks (the (2N+1) broken fibers
in our analysis).

The mechanism of the shear-lag model is a highly idealized one. In the
absence of breaks the whole laminate is in a homogeneous stress state with the
only non-zero stresses being the constant normal stresses $4P_\infty/\pi D^2$ in the axial
direction of the fibers. The load $P_\infty$ is the constant tensile load applied to
each fiber at an infinite distance from the breaks. The matrix material is
normal stress free before any breaks occur. This is true if sufficient time
has elapsed from the loading of the composite so that stress relaxation in the
matrix has occurred. Approximately the above is true for any time, since we
have assumed that the fibers are much stiffer than the matrix in tension. As
soon as one fiber breaks, the load of that fiber near the break is transferred
to the neighboring fibers by means of shear forces, which are exerted on the
matrix material through the fiber-matrix interface.

A free body diagram of an infinitesimal portion of the $n$-th fiber together
with its surrounding matrix is shown in Fig. 2. Even though the fibers are
cylindrical and the stress fields in the laminate are inherently
three-dimensional, we simplify the problem by first assuming constant normal stresses
in all cross-sections perpendicular to the fiber axis. We then assume constant
shear stresses in the matrix in the XZ plane in the Z direction, and in the Y
direction between two neighboring fibers. To justify the last assumption we
introduce an effective width $H_f$ of the matrix layer between two neighboring
fibers, such that the product $(BH_f)$ gives the matrix cross-sectional area
$(BH - \pi D^2/4)$ between these fibers. It is obvious from the above that the
effective width $H_f$ must be equal to $(H - \pi D^2/4B)$. If $B$ is substantially larger
than $D$, the requirement of constant shear stresses in the Z direction is not
valid any more, and an effective thickness $B_f$ has to be introduced. As a first
approximation we can choose $B_f = D$, in which case $H_f = H - \pi D/4$. The
assumptions about the effective width and the effective thickness require the
notion of an effective shear modulus for the matrix, to be determined by
experiments. The effective shear modulus will in general be different for
different cross-sectional geometries of the fibers and different ratios $B/D$.
Detailed discussion on the selection of $H_f$ and $B_f$ is given by REEDY (1984).

Further  simplifications  introduced  by  the  shear-lag  model  concern  the  normal 
stresses  in  the  matrix  in  the  X  direction,  which  are  neglected  for  reasons 
mentioned  earlier.  The  normal  stresses  in  the  matrix  in  the  Y  direction  are 
assumed  to  remain  constant  throughout  the  effective  width  of  the  matrix.  Any 
out-of-plane  stresses  in  the  fibers  and  in  the  matrix  are  neglected  as  well,  as 
the  problem  is  assumed  to  be  two-dimensional  in  the  above  introduced  effective 
configuration. 

By taking into account the above simplifications and in the absence of
inertial forces, equilibrium of forces in the X and Y directions (see Fig. 2)
results in the following equations:

$\dfrac{\partial P_n}{\partial X} + B_f(\mathcal{T}_{n+1} - \mathcal{T}_n) = 0$,   (1)

where $P_n$ is the normal load in the $n$-th fiber, $\mathcal{T}_n$ is the shear stress in the
matrix between the $n$-th and $(n-1)$-th fibers, and the remaining quantity in
eqn (2) is the normal stress in the matrix between the $n$-th and $(n-1)$-th
fibers in the Y direction. Eqn (1) implies
that the variation of the normal load along a fiber is due to the difference in
the shear stresses applied by the matrix on both sides of that fiber. Eqn (2)
implies that the dependence of the matrix shear stress on the distance from the
breaks results in non-zero normal stresses in the matrix in the Y direction.
These normal stresses are maximum near the breaks, where we expect the largest
variation in the shear stress, and they might be important in the analysis of
the fiber-matrix interface, for example in the case of debonding. Note that
equilibrium of moments does not hold in the infinitesimal element of Fig. 2, as
a result of neglecting the shear stresses in the fiber cross-sections in the Y
direction, unless we assume that the ratio $D/H_f$ is very small. Since $D$ is of
the same order as $H_f$ for most applications, we propose the use of a correction


factor  that  restores  balance  of  moments  by  replacing  with  , 


<  n  <  °°,  in  eqn  (2). 

Upon specifying constitutive relations for the matrix and fibers, the
above set of equations becomes field differential-difference equations for the
determination of the displacement fields $U_n$ and $V_n$ of the $n$-th fiber along the X
and Y directions, respectively, as functions of position $X$ and time $T$. In the
present work we assume that the fibers are linearly elastic, namely

$P_n = AE \dfrac{\partial U_n}{\partial X}$,   (3)


where $A$ is the fiber cross-sectional area and $E$ is the axial Young's modulus of
the fibers. The matrix material is taken to be linearly viscoelastic in shear,
that is

$\mathcal{T}_n(X,T) = \int_{-\infty}^{T} G(T-S)\, \dfrac{\partial \gamma_n(X,S)}{\partial S}\, dS$,   (4)

where $G(T)$ is the relaxation modulus and $\gamma_n(X,T)$ is the shear strain in the
matrix. In order to decouple the system of eqns (1) and (2) in $U_n$ and $V_n$, we
approximately take $\gamma_n = (U_n - U_{n-1})/H_f$ by neglecting the term $\partial_X(V_{n-1} + V_n)/2$
(ERINGEN and KIM, 1974), in which case (4) reduces to

$\mathcal{T}_n = \dfrac{1}{H_f} \left[ \int_{-\infty}^{T} G(T-S)\, \dfrac{\partial U_n}{\partial S}\, dS - \int_{-\infty}^{T} G(T-S)\, \dfrac{\partial U_{n-1}}{\partial S}\, dS \right]$.   (5)
J  —09  *  —09  J 


We nondimensionalize the time variable by dividing $T$ by some
characteristic time $T_0$ of the matrix material, to be found by creep
experiments, so that $t \equiv T/T_0$. We also define a normalized relaxation modulus
$\mathcal{G}(t) = G(tT_0)/G_0$, where $G_0$ is the instantaneous elastic shear modulus of the
matrix material. (In this work lower case letters and script letters, except
for the script letter $\mathcal{T}$ used to denote dimensional shear stress, denote
dimensionless quantities, while upper case letters stand for dimensional
quantities.)
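
The hereditary integral of the linear viscoelastic law can be evaluated numerically for any strain history, as the following sketch shows. It uses a midpoint rule and, purely for checking purposes, an exponential relaxation modulus exp(-t), for which a unit-rate strain ramp gives the closed form 1 - exp(-t); this modulus is an assumption for the test and is not the power-law model treated in this paper. The function and variable names are hypothetical.

```python
import math

def shear_stress(times, strains, relaxation):
    """Midpoint-rule evaluation of tau(t_k) = integral of G(t_k - s) d(gamma)(s).

    The strain is assumed to vanish before times[0].
    """
    stresses = []
    for k, t in enumerate(times):
        total = 0.0
        for i in range(1, k + 1):
            mid = 0.5 * (times[i] + times[i - 1])
            total += relaxation(t - mid) * (strains[i] - strains[i - 1])
        stresses.append(total)
    return stresses

times = [0.01 * i for i in range(201)]        # dimensionless time 0 .. 2
strains = list(times)                         # unit-rate strain ramp
tau = shear_stress(times, strains, lambda t: math.exp(-t))
```

Replacing the lambda by a measured or fitted relaxation modulus leaves the rest of the routine unchanged, which is the practical appeal of working with the convolution form.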

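If we introduce the integral operator $\mathcal{G}$ so that its action on a function
$f(t)$ is given by

$\mathcal{G}\langle f \rangle(t) = \int_{-\infty}^{t} \mathcal{G}(t-s)\, \dfrac{\partial f(s)}{\partial s}\, ds$,   (6)

substitution of (3) and (5) into (1), upon using (6), yields second order
differential-difference equations for the determination of $U_n$, namely

$\dfrac{AEH_f}{G_0 B_f} \dfrac{\partial^2 U_n}{\partial X^2} + \mathcal{G}\langle U_{n+1} - 2U_n + U_{n-1} \rangle = 0$, $-\infty < n < \infty$.   (7)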


If the solution to (7) can be found, substitution of $U_n$ into (5) yields the
shear stresses $\mathcal{T}_n$, and hence eqns (2) can be solved for $V_n$. $V_n$ can be easily
determined if a linearly elastic constitutive model is selected for the normal
stresses in the matrix perpendicular to the fiber direction, i.e., a normal
stress equal to $E_m(V_n - V_{n-1})/H_f$ ($E_m$ is the effective Young's modulus for the
matrix). If we use a linear viscoelastic model for the normal stresses, the
Laplace transform method can be used to render eqns (2) algebraic in $\bar{V}_n$, the
Laplace transformed displacement $V_n$. The decoupling of the vertical and
horizontal displacements allows us to consider only eqns (7) in our solution
procedure.

$X$ and $U_n$ are normalized so that the field equations and the boundary and initial conditions are independent of the material parameters. If we select $x = X/X_0 = X/\sqrt{A E H_f/(G_0 B_f)}$ and $u_n(x,t) \equiv U_n(X,T)/\sqrt{P_\infty^2 H_f/(G_0 A E B_f)}$, eqns (7) become

$$\frac{\partial^2 u_n}{\partial x^2} + \mathcal{G}\langle u_{n+1} - 2u_n + u_{n-1}\rangle = 0 , \quad -\infty < n < \infty \qquad (8)$$


The boundary conditions are given by

$$\frac{\partial u_n}{\partial x} = 1 , \quad -\infty < n < \infty , \quad x \to \infty , \quad t > 0 , \qquad (9a)$$

$$\frac{\partial u_n}{\partial x} = 0 , \quad -N \le n \le N , \quad x = 0 , \quad t > 0 , \qquad (9b)$$

$$u_n = 0 , \quad -\infty < n < -N , \ \ N < n < \infty , \quad x = 0 , \quad t > 0 , \qquad (9c)$$

while the initial conditions are

$$u_n = x , \quad -\infty < n < \infty , \quad x \ge 0 , \quad t = 0 . \qquad (10)$$


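In order to avoid unbounded displacement fields in the analysis, we perform the transformation

$$w_n = u_n - x \qquad (11)$$

which, after its substitution into eqns (8), (9) and (10), results in the following field equations, boundary and initial conditions:

$$\frac{\partial^2 w_n}{\partial x^2} + \mathcal{G}\langle w_{n+1} - 2w_n + w_{n-1}\rangle = 0 , \quad -\infty < n < \infty , \qquad (12)$$

$$\frac{\partial w_n}{\partial x} = 0 , \quad -\infty < n < \infty , \quad x \to \infty , \quad t > 0 , \qquad (13a)$$

$$\frac{\partial w_n}{\partial x} = -1 , \quad -N \le n \le N , \quad x = 0 , \quad t > 0 , \qquad (13b)$$

$$w_n = 0 , \quad -\infty < n < -N , \ \ N < n < \infty , \quad x = 0 , \quad t > 0 , \qquad (13c)$$

$$w_n = 0 , \quad -\infty < n < \infty , \quad x \ge 0 , \quad t = 0 .$$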


Notice  that  the  field  equations  remain  unchanged  in  form.  This  is  because  the 
transformation  (11)  is  a  time  independent  translation.  The  change  in  the 
boundary  conditions  has  altered  the  original  problem  into  a  new  one,  in  which 
there  are  no  loads  at  infinity  and  there  are  only  compressive  loads  applied  on 
the  broken  fibers  suddenly  at  t  =  0,  which  open  up  the  breaks  as  t  grows. 

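The above equations can be solved by using Laplace transforms. The Laplace transform of $\mathcal{G}\langle w_n\rangle$ is given by the convolution law $L(\mathcal{G}\langle w_n\rangle) = s\,\bar{\mathcal{G}}(s)\,\bar{w}_n(x,s)$, where $\bar{\mathcal{G}}(s)$ and $\bar{w}_n(x,s)$ are the Laplace transforms of $\mathcal{G}(t)$ and $w_n(x,t)$, respectively. The Laplace transforms of (12) and (13), upon using (14), become

$$\frac{\partial^2 \bar{w}_n(x,s)}{\partial x^2} + s\,\bar{\mathcal{G}}(s)\,[\bar{w}_{n+1}(x,s) - 2\bar{w}_n(x,s) + \bar{w}_{n-1}(x,s)] = 0 , \quad -\infty < n < \infty , \qquad (15)$$

$$\frac{\partial \bar{w}_n}{\partial x} = 0 , \quad -\infty < n < \infty , \quad x \to \infty , \qquad (16)$$

$$\frac{\partial \bar{w}_n}{\partial x} = -\frac{1}{s} , \quad -N \le n \le N , \quad x = 0 , \qquad (17a)$$

$$\bar{w}_n = 0 , \quad -\infty < n < -N , \ \ N < n < \infty , \quad x = 0 . \qquad (17b)$$

We have thus transformed the original viscoelastic problem into an elastic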
shear-lag  problem  (correspondence  principle,  CHRISTENSEN,  1982).  We  will 
follow  here  the  methodology  presented  by  ERINGEN  and  KIM  (1974)  and  used  also 
by  GOREE  and  GROSS  (1979)  for  the  solution  of  the  elastic  shear-lag  problem, 
which is a dual integral equations technique. However, one can also use the
influence  function  technique  developed  by  HEDGEPETH  (1961). 

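We reduce eqns (15) to a single differential equation by introducing the finite cosine transform (CHURCHILL, 1972). Define $\tilde{w}$ by

$$\tilde{w} = \frac{1}{\pi}\sum_{n=-\infty}^{\infty} \bar{w}_n \cos(n\theta) , \quad 0 \le \theta \le \pi , \qquad (18a)$$

with the inversion formula given by

$$\bar{w}_n = \int_0^{\pi} \tilde{w}\,\cos(n\theta)\,d\theta , \qquad (18b)$$

where $\tilde{w} = \tilde{w}(x,s,\theta)$, $\bar{w}_n = \bar{w}_n(x,s)$. By summing eqns (15) with $n$ running from $-\infty$ to $\infty$, after having multiplied them by $\cos(n\theta)$, and by taking into account the symmetry $\bar{w}_n(x,s) = \bar{w}_{-n}(x,s)$, it is found that $\tilde{w}$ satisfies

$$\frac{d^2\tilde{w}}{dx^2} - 4s\,\bar{\mathcal{G}}(s)\,\sin^2(\theta/2)\,\tilde{w} = 0 . \qquad (19)$$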

The  resulting  simplification  in  the  field  equations  has  shifted  the  difficulty 
into  the  boundary  conditions,  which  turn  out  to  be  integral  equations,  namely 


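$$\frac{d\tilde{w}}{dx} = 0 , \quad x \to \infty , \qquad (20)$$

$$\int_0^{\pi} \frac{\partial \tilde{w}}{\partial x}\,\cos(n\theta)\,d\theta = -\frac{1}{s} , \quad 0 \le n \le N , \quad x = 0 , \qquad (21a)$$

$$\int_0^{\pi} \tilde{w}\,\cos(n\theta)\,d\theta = 0 , \quad N < n < \infty , \quad x = 0 . \qquad (21b)$$

A solution to (19) that satisfies the boundary condition (20) is given by

$$\tilde{w} = f(s,\theta)\,\exp\!\left[-2\sin(\theta/2)\,x\sqrt{s\,\bar{\mathcal{G}}(s)}\right] \qquad (22)$$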


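for some $f(s,\theta)$. Substitution of (22) into (21a) and (21b) yields the conditions

$$\int_0^{\pi} f(s,\theta)\,\sin(\theta/2)\cos(n\theta)\,d\theta = \frac{1}{2s\sqrt{s\,\bar{\mathcal{G}}(s)}} , \quad 0 \le n \le N , \qquad (23a)$$

$$\int_0^{\pi} f(s,\theta)\,\cos(n\theta)\,d\theta = 0 , \quad N < n < \infty , \qquad (23b)$$

for $f(s,\theta)$. By letting $f(s,\theta) = \sum_{m=0}^{N} a_m(s)\cos(m\theta)$, the conditions (23a) for the broken fibers reduce to

$$\sum_{m=0}^{N} a_m(s)\int_0^{\pi}\sin(\theta/2)\cos(n\theta)\cos(m\theta)\,d\theta = \frac{1}{2s\sqrt{s\,\bar{\mathcal{G}}(s)}} , \quad 0 \le n \le N , \qquad (24)$$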


while conditions (23b) for the unbroken fibers are satisfied identically. The complete satisfaction of the boundary conditions reduces then to the solution of the algebraic system (24) of $(N+1)$ equations, for the determination of the $(N+1)$ unknown functions $a_m(s)$, $m = 0,1,2,\ldots,N$. The solution to the transformed problem is found by substituting $\tilde{w}$ from (22) into (18b) and is given by the following expression:

$$\bar{w}_n(x,s) = \sum_{m=0}^{N} a_m(s)\int_0^{\pi}\exp\!\left[-2\sin(\theta/2)\sqrt{s\,\bar{\mathcal{G}}(s)}\;x\right]\cos(m\theta)\cos(n\theta)\,d\theta . \qquad (25)$$

The inversion of the Laplace transforms of $\bar{w}_n$ will result in $w_n(x,t)$. The difficulty of the inversion will mainly depend on the selection of the constitutive model (i.e., $\bar{\mathcal{G}}(s)$) for the viscoelastic matrix.


A clarifying remark regarding the number of broken fibers is in order at this point. We have assumed that the number of breaks is an odd integer, namely $2N+1$, and as a consequence we have used the finite cosine transform (18), taking into account the symmetry of $w_n$ about the x axis. We could easily model any number of breaks by using the finite exponential transform (CHURCHILL, 1972), which is given by

$$\tilde{w}(x,s,\theta) = \frac{1}{2\pi}\sum_{n=-\infty}^{+\infty} \bar{w}_n(x,s)\,\exp(in\theta) \qquad (18c)$$

$$\bar{w}_n(x,s) = \int_{-\pi}^{\pi} \tilde{w}(x,s,\theta)\,\exp(-in\theta)\,d\theta \qquad (18d)$$

and reduces to the finite cosine transform whenever $\bar{w}_n = \bar{w}_{-n}$, or $\tilde{w}$ is symmetric in $\theta$. The only change in the previous analysis is that now $f(s,\theta) = \sum_{m=-M}^{N} a_m(s)\exp(-im\theta)$, where the total number of breaks is $(M+N+1)$ and the algebraic system (24) involves $(M+N+1)$ unknown functions $a_m(s)$.

The important quantities in the analysis of shear-lag models are the overloads in the fibers near breaks and the shear stresses in the matrix. The nondimensional loads in the fibers, defined by $p_n(x,t) = P_n(xX_0, tT_0)/P_\infty$, can be found by substituting $w_n(x,t)$ from (25) into (3) upon using (11), and they are given by

$$p_n(x,t) = \frac{\partial w_n(x,t)}{\partial x} + 1 , \quad n \ge 0 . \qquad (26)$$

The normalized shear stresses $\tau_n(x,t) = \mathcal{T}_n(xX_0, tT_0)/\sqrt{P_\infty^2 G_0/(A E B_f H_f)}$ between the $n$-th and the $(n-1)$-th fibers are evaluated by substitution of $w_n(x,t)$ into (5) (which upon using (11) yields the normalization), and they are given by

$$\tau_n(x,t) = \int_0^{t} \mathcal{G}(t-\xi)\,\frac{\partial (w_n - w_{n-1})}{\partial \xi}\,d\xi , \quad n \ge 1 . \qquad (27)$$

Another useful quantity, especially for statistical models of failure of composites (PHOENIX and TIERNEY, 1983), is the effective load transfer length $L_f$, which for present purposes is defined as the distance from the breaks in the x direction, within which the overload of the first unbroken fiber has dropped to zero. Since in the shear-lag model the load $P_{N+1}$ of the first intact fiber actually descends to values below $P_\infty$ before it decays exponentially to $P_\infty$ as $x \to \infty$, we define $L_f$ as the distance from the breaks at which $P_{N+1}$ crosses $P_\infty$. In this case $L_f$, or equivalently the normalized effective load transfer length $l_f = L_f/\sqrt{A E H_f/(G_0 B_f)}$, must satisfy the conditions

$$p_{N+1}(l_f,t) = 1 , \quad \text{or} \quad \frac{\partial w_{N+1}(l_f,t)}{\partial x} = 0 . \qquad (28)$$

In general $l_f$ will depend on time because $p_{N+1}$ depends on time. The so defined $l_f$ becomes a characteristic length for the whole laminate for a given number of breaks $(2N+1)$.

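We summarize the results of this section by giving explicit evaluations for the various quantities. If we define $b_m = a_m(s)\,2s\sqrt{s\,\bar{\mathcal{G}}(s)}$, then $b_m$ are determined by solving the algebraic system

$$\sum_{m=0}^{N} b_m\int_0^{\pi}\sin(\theta/2)\cos(n\theta)\cos(m\theta)\,d\theta = 1 , \quad 0 \le n \le N , \qquad (29)$$

which is independent of $s$. Eqns (25), (26) and (27) reduce to

$$w_n(x,t) = \sum_{m=0}^{N} b_m\int_0^{\pi} L^{-1}\!\left\{\frac{\exp[-2\sin(\theta/2)\sqrt{s\,\bar{\mathcal{G}}(s)}\,x]}{2s\sqrt{s\,\bar{\mathcal{G}}(s)}}\right\}\cos(m\theta)\cos(n\theta)\,d\theta , \qquad (30)$$

$$p_n(x,t) = 1 - \sum_{m=0}^{N} b_m\int_0^{\pi} L^{-1}\!\left\{\frac{\exp[-2\sin(\theta/2)\sqrt{s\,\bar{\mathcal{G}}(s)}\,x]}{s}\right\}\sin(\theta/2)\cos(m\theta)\cos(n\theta)\,d\theta , \qquad (31)$$

$$\tau_n(x,t) = \sum_{m=0}^{N} b_m\int_0^{\pi} L^{-1}\!\left\{\exp[-2\sin(\theta/2)\sqrt{s\,\bar{\mathcal{G}}(s)}\,x]\,\frac{\sqrt{s\,\bar{\mathcal{G}}(s)}}{2s}\right\}\cos(m\theta)[\cos(n\theta) - \cos((n-1)\theta)]\,d\theta , \qquad (32)$$

where

$$f(t) = L^{-1}[\bar{f}(s)] = \frac{1}{2\pi i}\lim_{\beta\to\infty}\int_{\gamma - i\beta}^{\gamma + i\beta}\exp(ts)\,\bar{f}(s)\,ds , \quad t > 0 . \qquad (33)$$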
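
The algebraic system (29) is small and can be checked numerically. The sketch below is an illustration only, not the authors' code. It assumes NumPy, fills the matrix with the closed form $\int_0^{\pi}\sin(\theta/2)\cos(k\theta)\,d\theta = 2/(1-4k^2)$, and uses hypothetical helper names (`overlap`, `solve_b`, `overload_at_break`, `hedgepeth`). At $x = 0$ and $t > 0$ the inverse transform in (31) reduces to $L^{-1}\{1/s\} = 1$, so the load in fiber $n$ is $1 - \sum_m b_m M_{nm}$. The value for the first intact fiber is compared with Hedgepeth's classical overload factor for $r = 2N+1$ adjacent breaks, $(4\cdot 6\cdots(2r+2))/(3\cdot 5\cdots(2r+1))$.

```python
import numpy as np

def overlap(n, m):
    """Integral of sin(theta/2) cos(n theta) cos(m theta) over [0, pi].

    Uses cos(a)cos(b) = [cos(a+b) + cos(a-b)]/2 together with the closed
    form  int_0^pi sin(theta/2) cos(k theta) dtheta = 2 / (1 - 4 k^2).
    """
    return 1.0 / (1.0 - 4.0 * (n + m) ** 2) + 1.0 / (1.0 - 4.0 * (n - m) ** 2)

def solve_b(N):
    """Solve system (29) for b_0..b_N (2N+1 symmetric breaks)."""
    M = np.array([[overlap(n, m) for m in range(N + 1)] for n in range(N + 1)])
    return np.linalg.solve(M, np.ones(N + 1))

def overload_at_break(N, n):
    """Load p_n at x = 0, t > 0 from (31), using L^{-1}{1/s} = 1."""
    b = solve_b(N)
    return 1.0 - sum(b[m] * overlap(n, m) for m in range(N + 1))

def hedgepeth(r):
    """Classical overload factor of the first intact fiber for r adjacent breaks."""
    k = 1.0
    for j in range(1, r + 1):
        k *= (2 * j + 2) / (2 * j + 1)
    return k

if __name__ == "__main__":
    for N in (0, 1, 2):
        print(N, overload_at_break(N, N + 1), hedgepeth(2 * N + 1))
```

For the broken fibers ($0 \le n \le N$) the same function returns zero load at $x = 0$, which is simply the boundary condition (29) restated.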


2.  POWER-LAW  CREEP  COMPLIANCE  MODEL  FOR  THE  MATRIX  MATERIAL 

A  useful  model  that  describes  closely  the  viscoelastic  properties  of 
commercially  used  matrix  materials  (epoxy  thermosetting  resins)  is  a  power-law 
creep  compliance  model  that  can  be  expressed  in  the  form 

$$J(T) = J_0\left[1 + \left(\frac{T}{T_0}\right)^{\alpha}\right] = J_0(1 + t^{\alpha}) \equiv J_0\,\mathcal{J}(t) . \qquad (34)$$

Here $J_0$ characterizes the instantaneous elastic response of the matrix material under loading and $T_0$ and $\alpha$ are material constants that describe the creep behavior under dead loading. The characteristic time $T_0$ is the time required for the initial displacement to be doubled, while the exponent $\alpha$ is usually much smaller than unity. The limit $\alpha \to 0$ corresponds to the elastic case, while $\alpha \to 1$ gives a linear time dependence which is equivalent to the Maxwell viscoelastic model. The connection between the relaxation modulus $\mathcal{G}(t)$ and the creep compliance $\mathcal{J}(t)$ is expressed through the Laplace transformed quantities (CHRISTENSEN, 1982) by the well-known formula

$$\bar{\mathcal{G}}(s)\,\bar{\mathcal{J}}(s)\,s^2 = 1 , \qquad (35)$$

if $G_0 = 1/J_0$. From (34) and (35) the Laplace transform of the relaxation modulus is found to be

$$\bar{\mathcal{G}}(s) = \frac{s^{\alpha-1}}{s^{\alpha} + \Gamma(\alpha+1)} . \qquad (36)$$

By inserting (36) into (30), (31) and (32) it is possible to obtain explicit evaluations for $w_n$, $p_n$ and $\tau_n$ in terms of $x$ and $t$ for different values of $\alpha$. The inversion of the Laplace transforms has been done by contour integration. We will only report the solution here for the fiber loads and the shear stresses, while the displacement fields can be obtained by integrating (26). The fiber loads and the shear stresses are found to be


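$$p_n(x,t) = 1 - \sum_{m=0}^{N} b_m\int_0^{\pi} h(x,t,\theta)\cos(m\theta)\cos(n\theta)\sin(\theta/2)\,d\theta , \qquad (37)$$

$$\tau_n(x,t) = \sum_{m=0}^{N} b_m\int_0^{\pi} g(x,t,\theta)\cos(m\theta)[\cos(n\theta) - \cos((n-1)\theta)]\,d\theta , \qquad (38)$$

where the functions $h(x,t,\theta)$ and $g(x,t,\theta)$ are given by

$$h(x,t,\theta) = 1 - \frac{1}{\pi}\int_0^{\infty}\exp(-tr)\,\exp\!\left[-X\sqrt{\frac{r^{\alpha}}{\rho}}\cos\!\left(\frac{\alpha\pi-\varphi}{2}\right)\right]\sin\!\left[X\sqrt{\frac{r^{\alpha}}{\rho}}\sin\!\left(\frac{\alpha\pi-\varphi}{2}\right)\right]\frac{dr}{r} , \qquad (39)$$

$$g(x,t,\theta) = -\frac{1}{2\pi}\int_0^{\infty}\exp(-tr)\,\sqrt{\frac{r^{\alpha}}{\rho}}\,\exp\!\left[-X\sqrt{\frac{r^{\alpha}}{\rho}}\cos\!\left(\frac{\alpha\pi-\varphi}{2}\right)\right]\sin\!\left[X\sqrt{\frac{r^{\alpha}}{\rho}}\sin\!\left(\frac{\alpha\pi-\varphi}{2}\right) - \frac{\alpha\pi-\varphi}{2}\right]\frac{dr}{r} . \qquad (40)$$

The quantities $X$, $\rho$ and $\varphi$ have the following evaluations:

$$X = 2\sin(\theta/2)\,x , \qquad (41)$$

$$\rho = \sqrt{[r^{\alpha}\cos(\alpha\pi) + \Gamma(\alpha+1)]^2 + [r^{\alpha}\sin(\alpha\pi)]^2} , \qquad (42)$$

$$\varphi = \tan^{-1}\!\left[\frac{r^{\alpha}\sin(\alpha\pi)}{r^{\alpha}\cos(\alpha\pi) + \Gamma(\alpha+1)}\right] , \quad 0 \le \varphi \le \pi . \qquad (43)$$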
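
As a numerical sanity check of the quantities $\rho$ and $\varphi$, the sketch below evaluates $s^{\alpha} + \Gamma(\alpha+1)$ on the upper side of the branch cut, $s = r e^{i\pi}$, and compares its modulus and argument with the closed forms above. It is an illustration under stated assumptions, not the authors' code: it assumes the normalized power-law compliance $\mathcal{J}(t) = 1 + t^{\alpha}$ of (34), relation (35), and only Python's standard `math` and `cmath` modules. The function names `s_gbar` and `rho_phi` are hypothetical.

```python
import cmath
import math

def s_gbar(s, alpha):
    """s * Gbar(s) for the normalized compliance J(t) = 1 + t**alpha.

    Jbar(s) = 1/s + Gamma(alpha+1)/s**(alpha+1), and Gbar = 1/(s**2 * Jbar).
    """
    jbar = 1.0 / s + math.gamma(alpha + 1.0) / s ** (alpha + 1.0)
    return s * (1.0 / (s ** 2 * jbar))

def rho_phi(r, alpha):
    """Modulus and argument of s**alpha + Gamma(alpha+1) at s = r*exp(i*pi)."""
    re = r ** alpha * math.cos(alpha * math.pi) + math.gamma(alpha + 1.0)
    im = r ** alpha * math.sin(alpha * math.pi)
    return math.hypot(re, im), math.atan2(im, re)

if __name__ == "__main__":
    r, alpha = 2.5, 0.1
    s = cmath.rect(r, math.pi)            # point on the upper side of the cut
    rho, phi = rho_phi(r, alpha)
    direct = s ** alpha + math.gamma(alpha + 1.0)
    print(abs(direct), rho)               # the two moduli should agree
    print(cmath.phase(direct), phi)       # the two arguments should agree
```

Using `atan2` rather than a plain arctangent keeps $\varphi$ on the branch $0 \le \varphi \le \pi$ when the real part becomes negative.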


Numerical integration of the above formulae has been carried out for both $p_n$ and $\tau_n$, even though they are related through (1). The reason for this is that $p_n$ is usually the quantity of primary interest and the numerical evaluation of $\tau_n$ from $p_n$ involves differentiation, which should be avoided. Numerical integration has been done by using a midpoint Romberg integration technique, with an appropriate change of variables at the singular points of the integrands. The results are plotted in Figs 3, 4, 5 and 6, for one and three broken fibers and for the first and second intact fibers for various times ($\alpha = 0.1$ for all cases).

From Figs 3 and 4 we notice that at $x = 0$ we recover the overload coefficients ($P_n(x=0,t)/P_\infty \equiv p_n(x=0,t)$) in accordance with the elastic solution of HEDGEPETH (1961). The overload coefficient of the first intact fiber in a laminate with N neighboring breaks as calculated by Hedgepeth is given by

$$K_N = \frac{4\cdot 6\cdot 8\cdots(2N+4)}{3\cdot 5\cdot 7\cdots(2N+3)} , \quad 0 \le N < \infty . \qquad (44)$$

The above formula holds for the viscoelastic case as well because the overall static equilibrium of the composite is not affected by the viscoelastic properties of the matrix material. This is because the matrix material cannot sustain normal loads in the x direction and there is no stress relaxation in the fibers as they are assumed to be elastic. Therefore, the excess load caused by the simultaneous breaks has to be shared by the neighboring intact fibers and only the stress distributions are affected by the viscoelastic properties of the matrix material.

Several observations can be drawn from Figs 3 and 4. The slope of the stress distribution in the fibers decreases as time increases, resulting in a growth of the effective load transfer length $l_f$ with time (Figs 3a, 4a). The overload undershoots and actually becomes negative before it decays to zero as $x \to \infty$ for the intact fibers. Global equilibrium of the composite in the x direction implies that $\sum_n [p_n(x,t) - 1] = 0$, with summation extending to all fibers. Since the negative overloads in the broken fibers grow with time, as a result of the shear stress relaxation in the matrix, the positive overloads in the intact fibers increase with time for fixed $x$, so that global equilibrium is satisfied (see Figs 3 and 4). This implies that the probability of failure for the intact fibers near breaks increases with time. The length over which this increased probability occurs also grows with time, this being the effective load transfer length $l_f$.

The relaxation of the shear stresses in the matrix can be seen in Figs 5 and 6. The shear-lag model gives inaccurate results for the shear stresses near the breaks (within one or two fiber diameters). The shear stresses in the matrix should go to zero at the break points and this is clearly violated according to the numerical results in Figs 5a and 6a. Modifications, like for example the correction in the calculation of the shear strain introduced by ERINGEN and KIM (1974), are consistent with continuum mechanics but in reality they are not accurate either. The reason for this is that debonding in the fiber-matrix interface near the breaks usually occurs due to the high stress concentrations there. This changes drastically the geometry in a small neighborhood around the breaks, and leads to additional plastic deformations in the matrix. Nevertheless, finite element results for the elastic shear-lag model (REEDY, 1984) indicate that the shear-lag model predicts correctly the stress concentrations in the intact fibers. Even though it is an approximate model, the shear-lag model for the viscoelastic case reveals the trend in the time dependence of the stress fields near broken fibers. Note that since the fibers are much stiffer than the matrix ($\ge 100$), the region in which the stresses are perturbed due to fiber breaks is 50 or more fiber diameters, while the shear-lag analysis fails to predict correctly the shear stresses in a small region of one or two fiber diameters away from the breaks.

[Figs. 3a and 3b]

ELEMENT LEVEL ELIMINATION OF NONLINEAR CONSTRAINTS IN TOTAL LAGRANGIAN FINITE ELEMENT FORMULATIONS

Watertown, MA 02172-0001 USA


SUMMARY 


Nonlinear  constraints  in  elastic  finite  deformation  theory  can  be  enforced  by 
an  iterative  element  level  variable  elimination  method  which  takes  advantage 
of  the  finite  element  discretization.  A  Lagrangian  potential  energy  method  is 
used  and  load  steps  are  taken  small  enough  so  that  the  potential  energy  is 
nearly  quadratic  when  expanded  as  a  function  of  displacement  increments.  The 
Newton-Raphson method is used to find minimal locations. Element gradient
and tangent matrices are computed and modified to be consistent with an
incremental  representation  of  the  nonlinear  constraint.  This  iterative 
variable  elimination  method  is  used  to  determine  the  solutions  to  the  bending 
of  an  elastica  around  an  ellipse  for  aspect  ratios  of  0.75,  1.00  and  1.50. 
Two  exterior  methods  are  also  used  to  solve  these  problems  for  comparison. 
The  Lagrange  multiplier  method  (ABAQUS  code)  and  a  penalty  method  are  used. 
The  results  obtained  using  the  element  level  elimination  method  are  compared 
to  the  results  obtained  using  ABAQUS  and  the  penalty  method. 




INTRODUCTION 


Many  problems  in  solid  and  fluid  mechanics  involve  finding  either  a  minimum 
or  stationary  value  of  an  energy  functional  such  that  an  additional 
constraint  equation  remains  valid.  Frictionless  contact  problems  which 
involve  large  elastic  deformations  and  curved  rigid  contact  surfaces  are 
considered  here.  Applications  include  contact  between  long  thin  metal  or 
paper  items  being  passed  through  channels  and  rigid  smooth  surfaced  indentors 
penetrating  rubberlike  solids.  When  these  problems  are  formulated  using  the 
finite  element  method  the  enforcement  of  the  constraint  equations 
(description  of  the  contact  surface)  is  usually  the  cause  of  difficulties. 
The  minimization  problem  is  often  modified  by  attaching  the  constraint 
equations  using  either  the  Lagrange  multiplier  technique  or  a  penalty  method. 
When  the  minimization  problem  is  quadratic  in  the  displacement  variables  and 
the  constraint  equations  are  linear,  elimination  methods  are  often  used.  This 
suggests  that  when  the  nonlinear  minimization  problem  can  be  made  nearly 
quadratic  an  elimination  method  may  be  possible.  An  appropriate 
representation  of  the  nonlinear  constraints  is  necessary  which  will  allow 
variables  to  be  eliminated  from  the  minimization  problem  and  approximately 
incorporate  the  constraint.  We  briefly  describe  the  nonlinear  minimization 
problem  associated  with  the  large  deformations  of  a  cantilever  beam,  the 
'elastica'.  The  minimization  problem  is  then  modified  by  attaching  the 
constraint  that  the  elastica  bend  around  a  rigid  frictionless  elliptical 
surface.  General  methods  for  solving  this  constrained  minimization  problem 
are  reviewed  to  provide  background  and  to  provide  methods  to  compare  to  the 
element  level  elimination  method  proposed  here. 

Finite element formulations for large deformations of beams exist in
several forms [1-5]. We selected Fried's [3] formulation since it is presented
as  a  nonlinear  minimization  problem  in  terms  of  configuration  variables. 
Contact  surfaces  can  be  described  in  terms  of  these  variables.  Also,  the 
nonlinear  'B23'  element  in  the  ABAQUS[5]  code  can  model  the  same  problem 
allowing  for  independent  comparisons.  Analytical  solutions  are  not  available. 
The  nonlinear  programming  problem  which  we  are  concerned  with  can  be 
presented  as  follows: 

Minimize: $f(\{u\})$ ; $\{u\} \in R^N$   (1)

Such that: $g_j(\{u\}) \ge 0$ , $j = 1,\ldots,J$

where $\{u\}$ = the global set of nodal variables,
$f(\{u\})$ = a nonquadratic potential energy function,
and $g_j(\{u\})$ = a differentiable function describing the j'th contact surface.

Frictionless nonlinear contact problems can be represented by equation (1). A large amount of information is available on methods for treating these problems. We present a brief summary of several methods used so that they can be compared to the iterative element level elimination method.

Lagrange multipliers are used to attach the constraint equation to the function being minimized [5-13]. Although this method is not attractive from a theoretical point of view, since it introduces possible saddle points, it has found widespread use because the Lagrange multipliers represent contact pressures. This method solves the nonlinear programming problem described by equation (1) by formulating it as follows.

Stationary points: $f(\{u\}) + \sum_{j=1}^{J} \lambda_j g_j$ ; $\{u,\lambda\} \in R^{N+J}$   (2)

where $\lambda_j$ = j'th Lagrange multiplier.

There are many methods used to construct representations like equations (2)
and to solve them. The paper by Simo, Wriggers and Taylor [11] describes the
Lagrange multiplier method in detail and introduces a perturbed form which is
a mixed penalty / Lagrange multiplier method.
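
To illustrate why formulation (2) is a stationary-point problem rather than a minimization problem, the following sketch applies a Lagrange multiplier to a two-variable quadratic energy with one linear constraint treated as active. It is a toy example, not taken from the paper or from ABAQUS: NumPy is assumed, and the data `K`, `p`, `a`, `b` and the function name `solve_lagrange` are made up for illustration. The augmented matrix has eigenvalues of both signs, which is the saddle-point character mentioned above.

```python
import numpy as np

# Toy quadratic energy f(u) = 1/2 u^T K u - p^T u  (illustrative data only)
K = np.array([[2.0, 0.0],
              [0.0, 1.0]])
p = np.array([2.0, 1.0])

# One active linear constraint g(u) = a^T u - b = 0
a = np.array([1.0, 1.0])
b = 1.0

def solve_lagrange(K, p, a, b):
    """Stationary point of f(u) + lam * (a^T u - b)."""
    n = len(p)
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = K          # tangent matrix block
    kkt[:n, n] = a           # constraint gradient (column)
    kkt[n, :n] = a           # constraint gradient (row)
    rhs = np.concatenate([p, [b]])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n], kkt

if __name__ == "__main__":
    u, lam, kkt = solve_lagrange(K, p, a, b)
    print("u =", u, "lambda =", lam)
    print("eigenvalues of augmented matrix:", np.linalg.eigvalsh(kkt))
```

One extra unknown is added per constraint, which is the price paid for obtaining the multiplier directly.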

The penalty method [6, 14-18] also attaches the constraint equations to the function being minimized. In doing so, it maintains a minimization problem. No variables are added to the analysis set but the large penalty parameters needed can cause ill-conditioning of the modified function's tangent matrix. The problem given by equation (1) is solved using the penalty method as follows.

Minimize: $f'(\{u\}) = f(\{u\}) + \sum_{j=1}^{J} \gamma_j g_j^2$ ; $\{u\} \in R^N$   (3)

where $\gamma_j$ = j'th large constant which may depend on the tangent matrix of $f(\{u\})$ (see references 17, 18).
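
The ill-conditioning mentioned above can be seen on a small example. The sketch below uses made-up data and hypothetical names, and assumes NumPy. It adds the penalty term $\gamma (a^T u - b)^2$ to a quadratic energy, so the tangent matrix becomes $K + 2\gamma\, a a^T$. As $\gamma$ grows the constraint violation shrinks, but the condition number grows roughly in proportion to $\gamma$.

```python
import numpy as np

K = np.array([[2.0, 0.0],
              [0.0, 1.0]])      # tangent matrix of the toy energy
p = np.array([2.0, 1.0])        # load vector
a = np.array([1.0, 1.0])        # constraint g(u) = a^T u - b
b = 1.0

def solve_penalty(gamma):
    """Minimize 1/2 u^T K u - p^T u + gamma * (a^T u - b)**2."""
    K_pen = K + 2.0 * gamma * np.outer(a, a)   # penalized tangent matrix
    rhs = p + 2.0 * gamma * b * a              # penalized load vector
    u = np.linalg.solve(K_pen, rhs)
    return u, abs(a @ u - b), np.linalg.cond(K_pen)

if __name__ == "__main__":
    for gamma in (1e0, 1e3, 1e6):
        u, violation, cond = solve_penalty(gamma)
        print(f"gamma={gamma:8.0e}  u={u}  |g|={violation:.2e}  cond={cond:.2e}")
```

No unknowns are added, in contrast to the multiplier form, but the constraint is only satisfied approximately for any finite $\gamma$.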

Nonlinear programming problems can sometimes be made to look like a quadratic programming problem if successive trial vectors $\{u\}$ are sufficiently close [19,20]. If, in addition, the constraint equations can be linearized, then the revised simplex method for quadratic programming can be used in an iterative scheme to solve equation (1) [19-21]. In this case the problem is formulated as follows.

Minimize: $f'(\{u\}) = \tfrac{1}{2}\{u\}^T[K_o]\{u\} - \{P_o\}^T\{u\}$

Such that: $[A]\{x\} - \{b\} \ge 0$

Set: $\{u_o\} = \{u\}$

Repeat until contact set, $\{x\}$, does not change.   (4)

where $[K_o]$ = the tangent matrix of $f(\{u\})$ evaluated at $\{u_o\}$,
$\{P_o\}^T = \{u_o\}^T[K_o]$,
$[A]\{x\} - \{b\}$ = the linearization of equations $g_j$ in (1),
$\{x\} \subset \{u\}$ is a trial set of variables in the contact set,
$\{u\}$ = a trial vector near $\{u_o\}$,
and $\{u_o\}$ = a vector which minimizes $f(\{u\})$ but does not satisfy the constraints.

This method can be useful when the functions $\{f, g_j\}$ can each be expanded in a Taylor series and when small changes in the constraint set are expected.

A  less  often  used  method  of  enforcing  constraints  during  minimization  is 
to  solve  the  J  constraint  equations  for  the  relations  for  J  variables  (in 
terms  of  the  remaining  N-J  variables).  The  relations  are  then  substituted 
into  the  function  being  minimized.  That  is,  they  are  eliminated[6] .  This 
yields  an  unconstrained  minimization  problem.  The  method  is  given  as  follows. 

Solve equations $g_j(\{u\}) = 0$ , $j = 1,\ldots,J$

and get $x_j = F_j(\{v\})$ , $j = 1,\ldots,J$   (5)

where $\{v\} \in R^{N-J}$.

Substitute (5) into (1) to obtain the unconstrained minimization problem.

Minimize: $f(\{v\})$ ; $\{v\} \in R^{N-J}$   (6)

We  have  intentionally  presented  elementary  descriptions  of  the  above 
methods  so  that  the  relationship  between  these  methods  and  the  element  level 
elimination  proposed  here  can  be  easily  identified. 
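
As a small illustration of eqns (5) and (6), consider a hypothetical two-variable problem that is not from the paper: minimize $f(x,y) = (x-1)^2 + (y-2)^2$ subject to $g(x,y) = y - x^2 = 0$. Here the constraint can be solved in closed form, $y = F(x) = x^2$, and substituted to give an unconstrained problem in the single variable $v = x$, which the sketch solves with Newton's method. All function names are illustrative.

```python
def reduced_energy_derivs(v):
    """First and second derivative of phi(v) = (v - 1)**2 + (v**2 - 2)**2."""
    d1 = 2.0 * (v - 1.0) + 4.0 * v * (v * v - 2.0)
    d2 = 2.0 + 12.0 * v * v - 8.0
    return d1, d2

def minimize_by_elimination(v0=1.5, tol=1e-12, max_iter=50):
    """Newton iteration on the reduced (unconstrained) problem (6)."""
    v = v0
    for _ in range(max_iter):
        d1, d2 = reduced_energy_derivs(v)
        step = -d1 / d2
        v += step
        if abs(step) < tol:
            break
    return v, v * v          # (x, y), with y recovered from y = F(x)

if __name__ == "__main__":
    x, y = minimize_by_elimination()
    print(x, y)
```

The approach works here only because $F$ is available explicitly, which is the limitation that motivates working with increments of the constraint instead.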

ELEMENT  LEVEL  ELIMINATION  METHOD 

The elimination method (eqns 5,6) is typically not used when the constraints are nonlinear since it is difficult to determine the functions $F_j(\{v\})$ given by equation (5). Also, when the minimization problem involves many variables, as in a finite element problem, it is difficult to automate a nonlinear elimination method. The method we propose here avoids the difficulties associated with determining equation (5) by working with the derivatives of the constraint equations. We return to solving equation (1) with $\{u\}$ equal to the vector of element nodal variables. Expanding (1) in a Taylor series we have:

$$f(\{u\}) = f(\{u_o\} + \{\Delta u\}) = f_o + \{g_o\}^T\{\Delta u\} + \tfrac{1}{2}\{\Delta u\}^T[K_o]\{\Delta u\} + \ldots \qquad (7)$$

where
$\{u\}$ = a vector "near" $\{u_o\}$ which is closer to the minimum,
$\{u_o\}$ = the current location,
$\{\Delta u\} = \{u\} - \{u_o\}$,
$\{g_o\} = \partial f/\partial\{u\}$,
$[K_o] = \partial^2 f/\partial\{u\}^2$.


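Since f is derived from an energy principle and has been discretized by the finite element method, we can express equation (7) as

$$f(\{u\}) = \sum_e \left( f_{eo} + \{g_{eo}\}^T\{\Delta u_e\} + \tfrac{1}{2}\{\Delta u_e\}^T[K_{eo}]\{\Delta u_e\} + \ldots \right) \qquad (8)$$

Assuming the constraint equations in (1) are differentiable we have

$$dg_j = 0 \qquad (9)$$

For simplicity we assume one constraint equation (i.e. $J = 1$). Then, we can write (9) as

$$\sum_i \frac{\partial g}{\partial x_i}\,dx_i = 0 \qquad (10)$$

where $\{x\} \subset \{u\}$. For small displacement increments

$$\sum_i \frac{\partial g}{\partial x_i}\,\Delta x_i \approx 0 \qquad (11)$$

Solving (11) for a displacement increment $\Delta x_\ell$ in the set $\{x\}$ we have

$$\Delta x_\ell = -\frac{1}{\partial g/\partial x_\ell}\sum_{i \ne \ell}\frac{\partial g}{\partial x_i}\,\Delta x_i \qquad (12)$$

This suggests that we can eliminate $\Delta x_\ell$ in favor of $g(\{x\})$ equal to zero at the element level. That is,

$$\{\Delta u_e\} = [A_e]\{\Delta u_{er}\} \quad \text{for an element} \qquad (13)$$

where $[A_e] = [A_e(\{u_e\})]$ and $\{\Delta u_{er}\} = \{\Delta u_e\} - \{\Delta x_\ell\}$.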


Thus, $[A_e]$ represents a constraint matrix which depends on the displacements and $\{\Delta u_{er}\}$ the reduced set of element variables. Substituting (13) into (8) we have

$$f(\{u\}) = \sum_e \left( f_{eo} + \{g_{eo}\}^T[A_e]\{\Delta u_{er}\} + \tfrac{1}{2}\{\Delta u_{er}\}^T[A_e]^T[K_{eo}][A_e]\{\Delta u_{er}\} + \ldots \right) \qquad (14)$$

In equation (14) we identify the reduced element gradient and tangent matrices as

$$\{g_{er}\} = [A_e]^T\{g_{eo}\} , \qquad [K_{er}] = [A_e]^T[K_{eo}][A_e]$$

Global gradient and tangent matrices can now be assembled in the standard way for the "reduced incremental variable set". The Newton-Raphson method is then used to find $\{u\}$. The $\ell_2$ norm of the reduced gradient is checked at the new location. If it is not zero then $\{u_o\}$ is set to $\{u\}$ and the process is repeated. It should be noted that the $x_\ell$ associated with the eliminated $\Delta x_\ell$ must be updated by solving a one dimensional nonlinear equation obtained from the constraint equation at each iteration.
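
The iteration described above can be sketched on a single "node" with two variables. The example below is a toy illustration under stated assumptions, not the paper's finite element implementation. The energy is the quadratic $f = \tfrac{1}{2}|u - u_t|^2$, so $[K_{eo}]$ is the identity; the constraint is the ellipse $g(x,y) = x^2/a^2 + y^2/b^2 - 1 = 0$; $\Delta y$ is the eliminated increment; and $y$ is updated from the constraint in closed form on the upper branch. NumPy is assumed and the function name `constrained_newton` is hypothetical. Because the reduced tangent omits the curvature of the constraint, convergence in this toy is only linear.

```python
import numpy as np

def constrained_newton(target, a=1.0, b=1.0, x0=0.5, tol=1e-10, max_iter=200):
    """Element-level elimination on one node constrained to an ellipse."""
    target = np.asarray(target, dtype=float)
    K = np.eye(2)                                # tangent matrix [K_eo]
    x = x0
    y = b * np.sqrt(1.0 - (x / a) ** 2)          # start on the constraint
    g_red = np.inf
    for _ in range(max_iter):
        grad = np.array([x, y]) - target         # gradient {g_eo}
        dg_dx, dg_dy = 2.0 * x / a**2, 2.0 * y / b**2
        # Constraint matrix from (12): dy = -(dg/dx)/(dg/dy) * dx
        A = np.array([[1.0], [-dg_dx / dg_dy]])
        g_red = (A.T @ grad).item()              # reduced gradient
        k_red = (A.T @ K @ A).item()             # reduced tangent
        if abs(g_red) < tol:
            break
        dx = -g_red / k_red                      # Newton-Raphson step
        while abs(x + dx) >= a:                  # keep x inside (-a, a)
            dx *= 0.5
        x += dx
        # Update the eliminated variable from the nonlinear constraint.
        y = b * np.sqrt(1.0 - (x / a) ** 2)
    return np.array([x, y]), g_red

if __name__ == "__main__":
    u, g_red = constrained_newton(target=(1.0, 1.0))
    print(u, g_red)
```

For a unit circle and target (1, 1) the iterate approaches the closest point on the circle, where the reduced gradient vanishes while the full gradient stays normal to the constraint.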

A  rule  must  be  made  for  determining  when  a  variable  which  is  a  member  of 
the  constrained  set  must  be  released.  That  is,  when  should  a  point  in  the 
contact  set  be  released?  We  can  write 


$$\Delta f \approx \frac{\partial f}{\partial u_1}\Delta u_1 + \frac{\partial f}{\partial u_2}\Delta u_2 + \ldots + \frac{\partial f}{\partial u_n}\Delta u_n = \{g\}^T\{\Delta u\}$$

where $u_i \in \{u\}$, which simply states that $\{g\}$ contains information on how $f(\{u\})$ changes with respect to $\{u\}$. If we consider changes in one variable at a time we obtain

$\partial f/\partial u_i > 0$ -> negative $\Delta u_i$ decreases f

$\partial f/\partial u_i < 0$ -> positive $\Delta u_i$ decreases f


The  rule  can  then  be  stated  as: 


RELEASE  RULE: 


Find the direction to decrease the energy. Assume it is $\Delta u_i$.
Then, if $u_i + \Delta u_i$ does not satisfy the constraint equation, keep
$u_i$ in the constrained set. Otherwise, release it.


In the case of two dimensional contact with physical variables x and y at the nodes and corresponding unit vectors i, j the release rule can be simplified as follows, see Figure 1.

$\{g_i\} = \frac{\partial f}{\partial x_i}\,\mathbf{i} + \frac{\partial f}{\partial y_i}\,\mathbf{j}$ = gradient terms for node i

$\mathbf{n}$ = unit normal to contact surface


Then, the release rule for two-dimensional contact, with (x,y) nodal variables, becomes:

RELEASE RULE FOR TWO-DIMENSIONAL CONTACT

If a > 0   Keep in constraint set.

If a < 0   Release from constraint set.

We note that a equals the component of generalized force outward from the surface and g equals the contact force.


ELASTICA  BENDING  AROUND  AN  ELLIPSE 


We  selected  the  problem  of  an  elastica  bending  around  an  ellipse,  as 
shown  in  Figure  2,  to  demonstrate  the  element  level  elimination  algorithm. 
The  aspect  ratio,  a,  is  varied  to  obtain  different  contact  problems.  If  "a" 
is  large  then  the  contact  region  changes  rapidly  with  a  small  change  in  load, 
P.  When  "a"  is  small  a  large  load  is  needed  for  initial  contact  and  the 
contact  region  changes  more  slowly  with  increasing  load  P.  Contact  solutions 
were  obtained  for  aspect  ratios  of  0.75,  1.00  and  1.50.  Here,  we  present  some 
details  on  how  the  solutions  were  obtained.  First  we  show  the  ABAQUS  solution 
(Lagrange  multiplier  method),  second  a  penalty  method  and  third  the  element 
level  elimination  algorithm. 

For  all  solution  methods,  the  elastica  was  of  length  it.  One  end  of  the 
elastica  was  fixed  at  the  origin  and  a  vertical  load  P  was  applied  to  the 
other  end.  For  each  aspect  ratio  of  the  ellipse,  the  load  history  used  is 
summarized  in  Table  1.  Young's  modulus  was  one  and  Poisson's  ratio  was  zero. 
To  approximate  an  inextensible  elastica,  a  ratio  of  the  cross  sectional  area 
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to  the  moment  of  inertia  equal  to  10  was  selected[ 19,20] .  The  elastica  was 
discretized  into  forty  elements.  Two  noded  beam  elements  with  cubic 
interpolation  functions  were  chosen  to  model  the  elastica.  For  the  element 
level  elimination  method  and  the  penalty  method  a  beam  element  developed  by 
Fried[8]  was  used.  Each  node  has  four  degrees  of  freedom,  see  Figure  3.  In 
this  element  the  axial  strain  is  based  on  the  definition  of  engineering 
strain. For the Lagrangian multiplier method a beam element ('B23' element)
with  three  degrees  of  freedom  per  node  was  selected  in  ABAQUS.  A  Lagrangian 
axial  strain  is  used  in  this  element  formulation. 

Lagrange  Multiplier  Method 

To  solve  the  problem  of  an  elastica  bending  around  an  ellipse  using  the 
Lagrange  multiplier  method,  the  finite  element  code  ABAQUS  was  used.  A  two 
noded  planar  interface  element  ('IRS21'  element)  was  chosen  to  detect  contact 
between  the  beam  element  and  the  rigid  surface  or  ellipse.  This  contact 
element  enforces  a  linear  pressure  distribution  between  nodes  and  has 
integration  points  at  the  nodes.  The  material  properties  cited  above  were 
input  using  the  'general  beam  section'  option.  Tolerances  were  set  at  0.4%  of 
the  applied  load  and  at  one  percent  of  the  moment.  Smaller  tolerances  did  not 
change  the  ABAQUS  finite  element  solution.  The  ellipse  was  defined  by  the 
user  subroutine  RSURFU.  At  each  integration  point  of  the  planar  interface 
element,  the  penetration  distance  into  the  ellipse  is  calculated.  To  do  this 
the  coordinates  of  the  point  on  the  ellipse  closest  to  the  integration  point 
must  be  found.  The  direction  cosines  of  the  tangent  and  the  rate  of  change  of 
the  tangent  along  the  surface  at  this  point  on  the  ellipse  are  also 
determined.

To  find  the  point  on  the  ellipse  closest  to  a  given  point  along  the 
elastica,  the  Newton  -  Raphson  method  was  used.  Given  the  equation  for  a 
point  (x,y)  on  the  ellipse,  we  need  to  minimize  the  distance  between  the 
point (x,y) and the integration point $(x_0, y_0)$ along the elastica. This can be
done using the same elimination method we use to define surface contact. We
proceed as follows.

Minimize:

$$ \Pi = (x - x_0)^2 + (y - y_0)^2 \qquad \text{(distance)} \tag{19} $$

Given:

$$ f(x) = \frac{x^2}{a^2} + (y + 1)^2 = 1 \qquad \text{(constraint equation)} \tag{20} $$

Reduce the variables (x,y) to x using the constraint equation.

$$ \Delta y = -\frac{x}{a^2 (1 + y)} \, \Delta x \tag{21} $$

From (21),

$$ \begin{Bmatrix} \Delta x \\ \Delta y \end{Bmatrix} = \begin{bmatrix} 1 \\ -\dfrac{x}{a^2 (y + 1)} \end{bmatrix} \{ \Delta x \} \tag{22} $$

That is, we have $\{\Delta u\} = [A] \{\Delta u_r\}$ in (22).
Using equation (19), the gradient and tangent stiffness matrices are derived.

$$ \{g\} = \begin{Bmatrix} 2(x - x_0) \\ 2(y - y_0) \end{Bmatrix} , \qquad [K] = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} $$

Next we solve for the reduced gradient and tangent stiffness matrices.

$$ \{g_r\} = [A]^T \{g\} = 2(x - x_0) - \frac{2x (y - y_0)}{a^2 (y + 1)} $$

$$ [K_r] = [A]^T [K] [A] = 2 \left( 1 + \frac{x^2}{a^4 (y + 1)^2} \right) $$

Applying the Newton-Raphson method to solve (24) using an initial guess for
$(x_1, y_1)$ yields

$$ \{x_2\} = \{x_1\} - [K_r(x_1)]^{-1} \{g_r(x_1)\} $$

or, here,

$$ x_2 = x_1 - \frac{(x_1 - x_0) - \dfrac{x_1 (y_1 - y_0)}{a^2 (y_1 + 1)}}{1 + \dfrac{x_1^2}{a^4 (y_1 + 1)^2}} $$

To update the y coordinate, we use the constraint equation. That is,

$$ y_2 = -1 + \sqrt{1 - \frac{x_2^2}{a^2}} $$

This process is repeated until the reduced gradient approaches zero.
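
The closest-point search described above can be sketched in a few lines. This is a minimal illustration only, assuming the ellipse of the constraint equation (20) is centered at (0, -1) with horizontal semi-axis a and unit vertical semi-axis, that the closest point lies on the upper half of the ellipse, and that the reduced tangent is taken as 2(1 + F^2) with F = dy/dx along the ellipse. The function name, the tolerance, the clipping safeguard and the starting guess are hypothetical choices, not taken from the paper.

```python
import math


def closest_point_on_ellipse(x0, y0, a, tol=1e-12, max_iter=100):
    """Closest point on the upper half of x^2/a^2 + (y + 1)^2 = 1 to (x0, y0).

    y is eliminated through the constraint, so the iteration runs on x only.
    """
    limit = 0.999 * a                      # keeps the square root real (assumed safeguard)
    x = max(-limit, min(limit, x0))        # starting guess: x0 clipped to the ellipse
    for _ in range(max_iter):
        y = -1.0 + math.sqrt(1.0 - (x / a) ** 2)
        slope = -x / (a * a * (y + 1.0))                 # dy/dx along the ellipse
        g_r = 2.0 * (x - x0) + 2.0 * slope * (y - y0)    # reduced gradient
        if abs(g_r) < tol:
            break
        k_r = 2.0 * (1.0 + slope * slope)                # reduced tangent
        x = max(-limit, min(limit, x - g_r / k_r))
    y = -1.0 + math.sqrt(1.0 - (x / a) ** 2)             # y from the constraint
    return x, y


# Circle check (a = 1): the closest point lies on the ray from the center (0, -1).
x, y = closest_point_on_ellipse(0.5, 0.0, 1.0)
print(x, y)
```

Because the reduced tangent omits the curvature of the constraint, the iteration is a simplified Newton scheme; it converges quickly when the query point lies close to the surface, which is the situation at a contact check.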
$$ \Pi_e' = \Pi_e + \sum_n \gamma_{np} \left( (x_n - x_e)^2 + (y_n - y_e)^2 \right) \tag{30} $$

where

$\gamma_{np} = c_n k_{nn} 10^p$ is the n'th penalty parameter,

p = a variable for convergence studies,

$c_n$ = 1/2 if one or no adjacent nodes have violated the constraint equation,
      1 if both adjacent nodes have violated the constraint equation,

$k_{nn}$ = the diagonal term from the tangent stiffness matrix associated with
      the displacement,

$(x_e, y_e)$ = the point on the ellipse closest to the node $(x_n, y_n)$ on
      the elastica. This point is found using the solution method described
      in the previous section.

The computation of the element gradient and tangent stiffness matrices is
shown below. With $\Pi_n = (x_n - x_e)^2 + (y_n - y_e)^2$ denoting the bracketed
term of (30),

$$ \{g_e'\} = \{g_e\} + \sum_n \gamma_{np} \begin{Bmatrix} \partial \Pi_n / \partial x_n & \text{(in row for } x_n) \\ \partial \Pi_n / \partial y_n & \text{(in row for } y_n) \end{Bmatrix} $$

where $\partial \Pi_n / \partial x_n = 2(x_n - x_e)$ and $\partial \Pi_n / \partial y_n = 2(y_n - y_e)$.

Similarly,

$$ [k_e'] = [k_e] + \sum_n \gamma_{np} \frac{\partial^2 \Pi_n}{\partial u^2} $$

where $\partial^2 \Pi_n / \partial x_n^2 = \partial^2 \Pi_n / \partial y_n^2 = 2$ on the diagonal entries for
$x_n$ and $y_n$. The element matrices are then assembled in the
usual manner. The Newton-Raphson method is applied to minimize the global
form of equation (30). To study convergence, the analysis was completed with
values of $10^p$ equal to 10E-5, 10E-2, and 10E-1.
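
The penalty contribution of a single penetrating node can be illustrated as follows. This is a sketch under stated assumptions: the penalty parameter is read as the product of a factor c (taken here as 1/2 or 1), the diagonal stiffness term and 10^p, and the closest point on the ellipse is held fixed while differentiating. The function names and the sample numbers are hypothetical.

```python
def penalty_parameter(k_nn, p, both_neighbors_violate):
    """gamma = c * k_nn * 10**p, with c = 1 or 1/2 (assumed reading of the text)."""
    c = 1.0 if both_neighbors_violate else 0.5
    return c * k_nn * 10.0 ** p


def penalty_contribution(node, closest, gamma):
    """Energy, gradient and tangent added for one penetrating node.

    node    -- (x_n, y_n), current nodal position on the elastica
    closest -- (x_e, y_e), closest point on the ellipse, held fixed here
    """
    dx = node[0] - closest[0]
    dy = node[1] - closest[1]
    energy = gamma * (dx * dx + dy * dy)
    gradient = (2.0 * gamma * dx, 2.0 * gamma * dy)      # rows for x_n and y_n
    tangent = ((2.0 * gamma, 0.0), (0.0, 2.0 * gamma))   # 2x2 block for (x_n, y_n)
    return energy, gradient, tangent


# Hypothetical numbers: a node 0.01 below the closest surface point.
gamma = penalty_parameter(k_nn=4.0, p=-2, both_neighbors_violate=False)
energy, gradient, tangent = penalty_contribution((0.30, -0.06), (0.30, -0.05), gamma)
print(gamma, energy, gradient)
```

Scaling the penalty by the diagonal stiffness term keeps the added stiffness comparable to the structural stiffness, which is one way to limit the ill-conditioning that large penalty values cause.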


Element  Level  Elimination 


For this method, we need to define the matrix $[A_e]$ from equation (13).
From the constraint equation (20) we can compute $\Delta y$ in terms of $\Delta x$.

$$ \Delta y = -\frac{x}{a^2 (y + 1)} \, \Delta x = F(x) \, \Delta x \tag{33} $$


Thus, when a node on the elastica is in the contact set we can compute the
element energy gradient and tangent matrices for which the relation (33)
holds by using equation (15). After assembly, the number of variables in the
global gradient and tangent matrices will be reduced by the number of nodes
in the contact set.


When the node in contact is the first node along the element, $[A_e]$
becomes:

[Equation (34): matrix $[A_e]$ for the first node in contact; not legible in the source]

where $\{u_{er}\}^T$ is the vector of reduced element variables.

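When the second node of the element is in contact the new $[A_e]$ can be readily
seen from equation (34). When both nodes of the element violate the
constraint, $[A_e]$ reduces to:

$$ [A_e] = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
F(x) & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & F(x) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix} \tag{35} $$

and $\{u_{er}\}^T$ is the vector of the six retained element variables.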

The  reduced  element  gradient  and  tangent  matrices  can  then  be  computed  using 
equation  (15). 
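
The reduction of an element gradient and tangent matrix by a transformation matrix can be sketched as below. This is an illustration only: it assumes four degrees of freedom per node ordered as (x, x', y, y'), that the y translation of a contacting node is the eliminated variable, and that the reduced quantities are A^T g and A^T K A. The ordering, the function names and the numerical values are hypothetical and are not taken from the paper.

```python
def build_transformation(contact, slopes):
    """8 x m matrix mapping retained element variables to all eight variables.

    contact -- (bool, bool): is node 1 / node 2 in the contact set?
    slopes  -- (F1, F2): dy/dx of the surface at each node (used only if in contact)
    """
    kept = [4 * node + local
            for node in range(2) for local in range(4)
            if not (local == 2 and contact[node])]       # drop y of contacting nodes
    A = [[0.0] * len(kept) for _ in range(8)]
    for col, row in enumerate(kept):
        A[row][col] = 1.0                                # retained variables map to themselves
    for node in range(2):
        if contact[node]:
            A[4 * node + 2][kept.index(4 * node)] = slopes[node]   # dy = F * dx
    return A


def reduce_element(A, g, K):
    """Return the reduced gradient A^T g and reduced tangent A^T K A."""
    n, m = len(A), len(A[0])
    g_r = [sum(A[i][j] * g[i] for i in range(n)) for j in range(m)]
    KA = [[sum(K[i][k] * A[k][j] for k in range(n)) for j in range(m)] for i in range(n)]
    K_r = [[sum(A[k][i] * KA[k][j] for k in range(n)) for j in range(m)] for i in range(m)]
    return g_r, K_r


# Both nodes in contact: eight variables reduce to six.
A = build_transformation((True, True), (0.5, -0.25))
identity = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
g_r, K_r = reduce_element(A, [1.0] * 8, identity)
print(len(g_r), g_r[0], K_r[0][0])
```

Applying the transformation element by element means the assembled system never contains the eliminated variables, so no multipliers or penalty terms enter the global tangent matrix.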

RESULTS 

To  demonstrate  the  use  of  the  element  level  elimination  method,  the  problem 
of  an  elastica  bending  around  a  frictionless  rigid  surface  in  the  form  of  an 
ellipse  was  solved.  For  comparison,  this  contact  problem  was  also  solved 
using  the  Lagrange  multiplier  (ABAQUS)  and  penalty  methods.  Similar  results 
were  obtained  for  all  methods.  The  displacements  both  in  the  contact  region 
and  at  the  tip  were  in  good  agreement. 

A  comparison  of  the  deformed  configuration  of  the  elastica  bending 
around a circle (a = 1.00, Figure 2), at a load of 0.55, shows close agreement
between  the  element  level  elimination  and  Lagrange  multiplier  methods,  see 
Table  2.  The  deformed  configurations  at  loads  of  0.60,  1.00,  and  1.25  are 
shown  in  Figure  5.  The  elastica  first  made  contact  with  the  circle  at  a  load 
of  0.55.  The  contact  solutions  do  have  some  differences.  The  ABAQUS  solution 
and  penalty  solution  show  regional  contact  while  the  element  level 
elimination  method  shows  point  contact.  When  two  nodes  lie  along  the  circle, 
the  elastica  is  in  contact  with  the  circle  at  some  point  between  those  nodes. 
Thus,  a  solution  with  two  nodes  in  contact  implies  point  contact  with  the 
rigid  surface.  The  location  of  the  contact  surface  on  the  circle  vs  load  is 
shown  in  Figure  6.  The  final  node  in  contact  at  a  given  load  is  the  same, 
refer  to  Table  3.  The  regional  solution  obtained  by  ABAQUS  and  the  penalty 
method  at  a  given  load  was  also  an  iterative  solution  obtained  by  the  element 
level  elimination  method.  By  considering  energy  minimization  and  using  the 
Release  Rule  for  Two  -  Dimensional  Contact,  the  additional  contact  points 
found  by  the  other  methods  were  released.  These  released  points  lie  close  to 
the circle; the maximum distance between a node and the circle was 10E-5. The
ABAQUS pressure forces for these additional nodes in contact were an order of
magnitude lower than the largest pressure force.

For  aspect  ratios  of  0.75  and  1.50,  the  contact  solution  was  identical 
for  all  methods.  (Refer  to  Tables  4  and  5.)  The  elastica  makes  initial 
contact  with  the  ellipse  at  a  load  of  1.70  and  has  regional  contact  when  the 
aspect  ratio  is  0.75.  For  an  aspect  ratio  of  1.50,  initial  contact  occurs  at 
a  load  of  0.17  and  point  contact  results.  Deformed  configurations  of  the 
elastica at various loads for the aspect ratios of 0.75 and 1.50 are shown in
Figure  8.  A  summary  of  the  results  for  the  element  level  elimination  contact 
solution  is  found  in  Figure  9. 

Tables  3  through  5  also  demonstrate  the  influence  of  the  penalty 
parameter  on  the  contact  solution.  When  the  penalty  parameter  equals  10E-5, 
the  optimal  solution  is  not  obtained.  The  distance  between  the  nodes  in 
contact  and  the  ellipse  is  10E-2.  When  the  penalty  term  is  increased,  the 
contact set becomes smaller and the distance between the nodes in contact and
the ellipse is 10E-5. At higher values, though, problems with convergence and
"chattering", an oscillation between two different contact solutions, were
observed.

SUMMARY 

An  element  level  elimination  algorithm  for  the  analysis  of  frictionless 
geometrically  nonlinear  constraint  problems  was  presented.  This  algorithm  is 
easy  to  implement  within  a  finite  element  code.  The  release  of  a  nodal 
variable  from  the  constraint  set  is  based  on  energy  minimization  principles. 
Ill  -  conditioning  of  the  tangent  stiffness  matrix  is  avoided.  To  demonstrate 
this  algorithm,  the  problem  of  an  inextensible  elastica  bending  around  an 
ellipse  was  solved.  For  comparison,  solutions  to  this  problem  were  also 
obtained  using  a  penalty  method  and  the  Lagrange  multiplier  (ABAQUS)  method. 


REFERENCES 


1. Y. Toda and G.C. Lee, 'Finite element solution to an elastica problem of
beams',  Int.  J.  Num.  Meth.  Engng.  2,  229-241  (1970). 


2.  I.  Fried,  'Finite  element  computation  of  large  elastic  deformations'.  The 
Mathematics  of  Finite  Elements  and  Applications  IV,  MAFELAP  1981,  ed. 

J.R.  Whiteman,  Academic  Press  1982. 


3.  I.  Fried,  'Nonlinear  finite  element  computation  of  the  equilibrium, 

stability  and  motion  of  the  extensional  beam  and  ring'.  Comp.  Meth.  Appl. 
Mech.  Eng.  38,  29-44  (1983). 


4.  B.W.  Golley,  'The  finite  element  solution  of  a  class  of  elastica 
problems'.  Comp.  Meth.  Appl.  Mech.  Eng.  46,  159-168  (1984). 


5. ABAQUS Version 4.5, Hibbitt, Karlsson, and Sorensen, Providence, R.I.,
1987. 


6.  G.V.  Reklaitis,  A.  Ravindran,  and  K.M.  Ragsdell,  Engineering  Optimization 
Methods  and  Applications,  John  Wiley  and  Sons,  New  York,  1983. 




7. T.J.R. Hughes, R.L. Taylor, and W. Kanoknukulchai, 'A finite element
method  for  large  displacement  contact  and  impact  problems'.  Formulations 
and  Computational  Algorithms  in  Finite  Element  Analysis,  eds.  K.J.  Bathe, 
J.T.  Oden  and  W.  Wunderlich,  MIT  (1976). 

8.  J.  Tseng  and  M.D.  Olson,  'The  mixed  finite  element  method  applied  to  two  - 
dimensional  elastic  contact  problems',  Int.  J.  Num.  Meth.  Engng.  17, 
991-1014  (1981). 

9. O.S. Narayanaswamy, 'Processing nonlinear multipoint constraints in the
finite  element  method',  Int.  J.  Num.  Meth.  Engng.  21,  1283-1288  (1985). 

10.  K.J.  Bathe  and  A.  Chaudhary,  'A  solution  method  for  planar  and 
axisymmetric  contact  problems',  Int.  J.  Num.  Meth.  Engng.  21,  65-88 
(1985). 

11.  J.C.  Simo,  P.Wriggers  and  R.L.  Taylor,  'A  perturbed  Lagrangian 
formulation  for  the  finite  element  solution  of  contact  problems'.  Comp. 
Meth.  Appl.  Mech.  Eng.  50,  163-180  (1985). 

12.  B.  Nour  -  Omid  and  P.  Wriggers,  'A  two  level  iteration  method  for 
solution  of  contact  problems'.  Comp.  Meth.  App.  Mech.  Eng.  54,  131-144 
(1986). 

13. P.P. Lazaridis and P.D. Panagiotopoulos, 'Boundary variational principles
for inequality structural analysis problems and numerical applications',
Comput.  Struct.  25,  35-49  (1987). 

14.  J.N.  Reddy,  'The  penalty  function  method  in  mechanics  a  review  of  recent 
advances'.  Penalty  -  Finite  element  Methods  in  Mechanics,  AMD  -  Vol.  51, 
ASME,  1982,  ed.  J.N.  Reddy. 

15.  N.  Kikuchi,  'A  smoothing  technique  for  reduced  integration  penalty 
methods  in  contact  problems',  Int.  J.  Num.  Meth.  Engng.  18,  343-350 
(1982). 

16.  T.  Endo,  J.T.  Oden,  E.B.  Becker  and  T.  Miller,  'A  numerical  analysis  of 
contact  and  limit  -  point  behavior  in  a  class  of  problems  of  finite 
elastic  deformations',  Comput.  Struct.  18,  899-910  (1984). 

17.  C.A.  Felippa,  'Error  analysis  of  penalty  function  techniques  for 
constraint  definitions  in  linear  algebraic  systems',  Int.  J.  Num.  Meth. 
Engng.  11,  709-728  (1977). 

18.  C.A.  Felippa,  'Iterative  procedures  for  improving  penalty  function 
solutions  of  algebraic  systems',  Int.  J.  Num.  Meth.  Engng.  12,  821-836 
(1978). 

19.  A.R.  Johnson  and  C.J.  Quigley,  'Buckled  elastica  in  contact  -  finite 
element  solutions'.  Transactions  of  the  Second  Army  Conference  on  Applied 
Mathematics  and  Computing,  NTIS  (AD  -  P004904),  February  1985. 




20.  A.R.  Johnson  and  C.J.  Quigley,  'Frictionless  geometrically  nonlinear 
contact  using  quadratic  programming',  submitted  to  Int.  J.  Num.  Meth. 
Engng. 


21. H.H. Rusin, 'A revised simplex method for quadratic programming', SIAM J.
Appl.  Math.  20(2),  143-160  (1971). 


22.  R.  Chand,  E.J.  Haug  and  R.  Kim,  'Analysis  of  unbonded  contact  problems  by 
means  of  quadratic  programming',  J.  Opt.  Theory  Appl.  20(2),  171-189 
(1976). 


23. E. Haug, R. Chand and K. Pan, 'Multibody elastic contact analysis by
quadratic  programming',  J.  Opt.  Theory  Appl.  21(2),  189-198  (1977). 


Table 2. Deformed Configuration of the Elastica Bending Around a Circle
at a Load of 0.55 (a = 1.00, Figure 2)

Node          Degree of    Element Level            ABAQUS
              Freedom      Elimination Solution     Solution

1             X             0.00                     0.00
              Y             0.00                     0.00
              θ             0.00                     0.00
2 (contact)   X             7.84601E-2               7.84585E-2
              Y            -3.0827E-2               -3.0826E-2
              θ            -7.8116E-2               -7.8113E-2
3             X             0.15645                  0.15645
              Y            -1.2168E-2               -1.2168E-2
              θ            -0.1533                  -0.1533
4             X             0.23357                  0.23357
              Y            -2.6955E-2               -2.6955E-2
              θ            -0.2251                  -0.2251
.             .             .                        .
.             .             .                        .
41            X             1.8571                   1.8576
              Y            -2.287                   -2.287
              θ            -1.247                   -1.247


[Table 3, nodes in contact vs load for the circular contact surface (a = 1.00,
Figure 2). The row and column alignment of this table did not survive
extraction; the entries are listed in the order in which they appear.]

Element Level Elimination Method**,*** (Iteration, Nodes):
    1; 2; 2; 1; 2; 2; 2,3; 1; 2,3; 2; 2-4; 3; 3,4

Lagrange Multiplier Method and Penalty Method** (Iteration, Nodes):
    3,4; 1; 3-5; 2-5; 2,4,5; 2; 4,5; 4,5; 4,5,7; 3-5,7; 3-7; 3-6; 3,5,6;
    2,3,5,6; 2,5,6; 5,6; 2,5,6

Penalty Parameter (listed four times): 10E-5, 10E-2, 10E-1

Remaining entries: 2-10; 2,4,5; 2,5,6

*   The entire penalty term is $c_n k_{nn} 10^p$, see eqn (30).

**  The norm of the gradient was less than or equal to 10E-8.

*** Convergence of the Newton-Raphson solution was not obtained after 20
    iterations.


Table 4. Nodes in Contact vs Load for Elliptical Contact
Surface (a = 0.75, Figure 2)

[Column headings: Element Level Elimination Method**; Lagrange Multiplier
Method; Penalty Method**; Penalty Parameter $10^p$* (E-5, E-2 and E-1 at each
load). The table body did not survive extraction.]

*   The entire penalty term is $c_n k_{nn} 10^p$, see eqn (30).

**  The norm of the gradient was less than or equal to 10E-8.

*** Convergence of the Newton-Raphson solution was not obtained after 20
    iterations.
Figure 6. Location of contact surface vs load for the elastica
bending around a circle
Abstract. Penetration of a sharp object causes large concentrated deformations in an
elastomer solid. The nonlinear nearly incompressible elastic stress analysis of the solid
is done with quadratic triangular elements and displacements referring to an immovable
grid. A lower order triangular mesh for a linear thermal analysis is conveniently laid with
vertices at the displaced nodes. This gives rise to highly irregular grids of slender elements
near the point of maximum penetration. The condition of the global thermal (stiffness)
matrix is estimated in terms of the element geometry. It is concluded that no significant
decline in the condition of the matrix takes place in spite of the high deformation.

Introduction.  To  set  up  the  finite  element  stiffness  and  mass  matrices  for  plane  thermal 
analysis  we  need  to  evaluate 


$$ I_1 = \int_\Delta (u_x^2 + u_y^2) \, dx \, dy \quad \text{and} \quad I_2 = \int_\Delta u^2 \, dx \, dy \tag{1} $$

over the typical triangular element $\Delta$, for a linearly assumed temperature distribution u.
Consider $\Delta$ with three sides $l_1, l_2, l_3$ and area A. If $u_e^T = (u_1, u_2, u_3)$ is the nodal unknowns
vector for $\Delta$, then $I_1$ and $I_2$ become the quadratic forms $I_1 = u_e^T k_e u_e$ and
$I_2 = u_e^T m_e u_e$, with

(2) 


(3) 


The matrices $k_e$ and $m_e$ are said to be the element stiffness and mass matrices, respectively.

Assembly of $k_e$ and $m_e$ over all $N_e$ finite elements in the grid produces the corresponding
global matrices K and M in the manner

$$ u^T K u = \sum_e u_e^T k_e u_e , \qquad u^T M u = \sum_e u_e^T m_e u_e \tag{4} $$

where u is the global vector of nodal unknowns, and where e indicates summation over all
triangles.

With minimization undertaken under the constraints of the boundary conditions our
thermal problem is such that

$$ \mu_1(h) = \min_u \frac{u^T K u}{u^T M u} \tag{5} $$

where h is a linear measure of the element, and

$$ \lim_{h \to 0} \mu_1(h) = \Lambda_1 > 0 \tag{6} $$

where $\Lambda_1$ is the fundamental eigenvalue of the differential operator describing the problem. For
a reasonably fine mesh it is safe to assume $\Lambda_1 = \mu_1$.

Actually, for a sufficiently fine mesh we may use the lumped element mass matrix

$$ m_e = \frac{A}{3} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{7} $$

instead of (3).

We denote by $\lambda_1^K$ and $\lambda_N^K$ the smallest (1st) and largest (Nth) eigenvalues of K. With
proper boundary conditions K is positive definite and we want to estimate its spectral
condition number

$$ C_2(K) = \frac{\lambda_N^K}{\lambda_1^K} \tag{8} $$

as a function of the inherent and discretization parameters of the problem.

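Global bounds. From Rayleigh's theorem we have that

$$ \lambda_1^K \le u^T K u \le \lambda_N^K , \qquad \lambda_1^M \le u^T M u \le \lambda_N^M \tag{9} $$

if $u^T u = 1$; while eq. (5) assures us that

$$ \frac{u^T K u}{u^T M u} \ge \Lambda_1 \tag{10} $$

for an arbitrary vector u that satisfies the essential boundary conditions.

If we choose u in eq. (10) so that $u^T K u = \lambda_1^K$, then we have from eq. (9) that

$$ \lambda_1^K \ge \Lambda_1 \lambda_1^M \tag{11} $$

On the other hand if we start with $u^T K u / u^T M u = \Lambda_1$, then we obtain

$$ \lambda_1^K \le \Lambda_1 \lambda_N^M \tag{12} $$

or combined

$$ \Lambda_1 \lambda_1^M \le \lambda_1^K \le \Lambda_1 \lambda_N^M \tag{13} $$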
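
The two one-sided bounds that combine into (13) follow from short chains of inequalities. The worked steps below are a sketch, assuming a normalized vector ($u^T u = 1$) and writing $\Lambda_1$ for the fundamental eigenvalue of the continuous problem and $\lambda^K$, $\lambda^M$ for the eigenvalues of K and M; this notation is an assumption made here for readability.

```latex
% Lower bound: take u as the normalized eigenvector of K belonging to \lambda_1^K,
% then apply (10) followed by the lower Rayleigh bound on u^T M u from (9).
\lambda_1^K = u^T K u \;\ge\; \Lambda_1 \, u^T M u \;\ge\; \Lambda_1 \lambda_1^M

% Upper bound: take u as the normalized vector at which the quotient in (5)
% attains \Lambda_1, then apply the Rayleigh bounds on u^T K u and u^T M u.
\lambda_1^K \;\le\; u^T K u = \Lambda_1 \, u^T M u \;\le\; \Lambda_1 \lambda_N^M
```

Both chains use only the Rayleigh bounds on K and M, which is why the conditioning of M controls how tight the resulting estimate of the smallest eigenvalue of K can be.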

The  usefulness  of  eq.  (13)  lies  in  the  fact  that  M  is  positive  definite  with  a  spectral 
condition  number  that  is  independent  of  h. 

The bounds in (13) are most critical and to make them tightest we want $\lambda_N^M / \lambda_1^M$ as
close to 1 as possible. This can be achieved with a nonuniform density distribution, and

$$ I_2 = \int \rho u^2 \, dx \, dy , \qquad \int \rho \, dx \, dy = 1 , \qquad \rho(x,y) > 0 \tag{14} $$

instead of (1). A variable density distribution affects $\Lambda_1$, which then needs to be reassessed.
We shall not pursue this matter here, as we shall see that it is not essential to the present
situation.
Element bounds. On matrices that are not diagonally dominant, and the high order
finite element and finite difference matrices are such, Gerschgorin's theorem fails at the
lower end of the eigenvalue spectrum. The eigenvalue bounds for the global stiffness matrix
are written here in terms of the eigenvalues of the element matrices. We shall show them
to be sharp and most convenient in finite element analysis.

If u denotes the global unknown vector and $u_e$ the one for the typical eth element,
then

$$ u^T u \le \sum_e u_e^T u_e \le p_{max} \, u^T u \tag{15} $$

where $p_{max}$ denotes the maximum number of elements that share a common node. Six is
a typical value for $p_{max}$ in plane problems.

To write bounds on $\lambda_N^K$ we choose a normalized u, $u^T u = 1$, such that

$$ \lambda_N^K = u^T K u = \sum_e u_e^T k_e u_e \tag{16} $$

If $\lambda_n^{k_e}$ denotes the largest eigenvalue of the positive semidefinite $k_e$, then for any $u_e$

$$ u_e^T k_e u_e \le \lambda_n^{k_e} \, u_e^T u_e \tag{17} $$

and eq. (16) yields with it

$$ \lambda_N^K \le \max_e(\lambda_n^{k_e}) \sum_e u_e^T u_e \le p_{max} \max_e(\lambda_n^{k_e}) \tag{18} $$

A lower bound on $\lambda_N^K$ is obtained from $\lambda_N^K \ge u^T K u$, $u^T u = 1$. Choosing $u_i = 0$ at all
nodes except for the $u_e$ that corresponds to the maximum eigenvalue of $k_e$ produces the desired
lower bound, and we have

$$ \max_e(\lambda_n^{k_e}) \le \lambda_N^K \le p_{max} \max_e(\lambda_n^{k_e}) , \qquad \max_e(\lambda_n^{m_e}) \le \lambda_N^M \le p_{max} \max_e(\lambda_n^{m_e}) \tag{19} $$

Since the element stiffness matrix $k_e$ is usually only positive semidefinite the bound

$$ \lambda_1^K \ge \min_e(\lambda_1^{k_e}) \tag{20} $$

where $\lambda_1^{k_e}$ is the lowest eigenvalue of $k_e$, reduces to the trivial $\lambda_1^K \ge 0$. But the element
mass matrix $m_e$ is positive definite, $\lambda_1^{m_e} > 0$ for all e, and

$$ \lambda_1^M \ge \min_e(\lambda_1^{m_e}) \tag{21} $$

is useful.

We combine eqs. (13), (19) and (21) to write

$$ \Lambda_1 \min_e(\lambda_1^{m_e}) \le \lambda_1^K \le \Lambda_1 \, p_{max} \max_e(\lambda_n^{m_e}) \tag{22} $$


Nearly collapsed triangles. The element stiffness matrix $k_e$ in eq. (2) is of rank two.
For $u_e^T = (1,1,1)$ we have $u_e^T k_e u_e = 0$. To write the two nonzero eigenvalues of $k_e$ we
introduce the notation $r_i = l_i^2/(2A)$, i = 1,2,3, and have that

$$ \lambda_{2,3}^{k_e} = \frac{1}{4} \left( r_1 + r_2 + r_3 \pm \sqrt{2} \sqrt{(r_1 - r_2)^2 + (r_2 - r_3)^2 + (r_3 - r_1)^2} \right) \tag{23} $$

To observe what happens to $\lambda_N^K$ when elements collapse we consider the triangle in
Fig. 1, and readily compute for it

$$ 2A = r^2 \sin\gamma , \qquad r_1 = r_2 = \frac{1}{\sin\gamma} , \qquad r_3 = \frac{2(1 - \cos\gamma)}{\sin\gamma} \tag{24} $$

If $\gamma \cong 0$, $\lambda_3^{k_e} = \gamma^{-1}$, and

$$ \frac{1}{\gamma} \le \lambda_N^K \le \frac{6}{\gamma} \tag{25} $$

while if $\gamma \cong 180^\circ$, $\lambda_3^{k_e} = 3(180 - \gamma)^{-1}$, and

$$ \frac{3}{180 - \gamma} \le \lambda_N^K \le \frac{18}{180 - \gamma} \tag{26} $$
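
The growth of the largest element eigenvalue as a triangle collapses can be checked numerically. The sketch below assumes the standard linear-triangle stiffness matrix for the integral of $u_x^2 + u_y^2$ (the matrix of eq. (2) is not reproduced here, so this is an assumption), and compares its two nonzero eigenvalues with a closed form in the ratios $r_i = l_i^2/(2A)$ whose prefactor of 1/4 is derived from that assumed matrix. Function names are hypothetical.

```python
import math


def element_stiffness(p1, p2, p3):
    """Stiffness matrix of a linear triangle for the integral of u_x^2 + u_y^2."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    b = (y2 - y3, y3 - y1, y1 - y2)
    c = (x3 - x2, x1 - x3, x2 - x1)
    area = 0.5 * abs(x1 * b[0] + x2 * b[1] + x3 * b[2])
    k = [[(b[i] * b[j] + c[i] * c[j]) / (4.0 * area) for j in range(3)]
         for i in range(3)]
    return k, area


def nonzero_eigenvalues(k):
    """Two nonzero eigenvalues of a rank-two symmetric 3x3 matrix."""
    trace = k[0][0] + k[1][1] + k[2][2]
    minors = (k[0][0] * k[1][1] - k[0][1] ** 2
              + k[0][0] * k[2][2] - k[0][2] ** 2
              + k[1][1] * k[2][2] - k[1][2] ** 2)
    root = math.sqrt(max(trace * trace - 4.0 * minors, 0.0))
    return 0.5 * (trace - root), 0.5 * (trace + root)


def eigenvalues_from_sides(p1, p2, p3):
    """Same eigenvalues from r_i = l_i^2 / (2A), using the closed form."""
    _, area = element_stiffness(p1, p2, p3)
    r = [math.dist(p, q) ** 2 / (2.0 * area)
         for p, q in ((p2, p3), (p3, p1), (p1, p2))]
    total = sum(r)
    spread = (r[0] - r[1]) ** 2 + (r[1] - r[2]) ** 2 + (r[2] - r[0]) ** 2
    root = math.sqrt(2.0 * spread)
    return 0.25 * (total - root), 0.25 * (total + root)


# Isosceles triangle with two unit sides enclosing a small angle gamma (radians).
gamma = 0.01
triangle = ((0.0, 0.0), (1.0, 0.0), (math.cos(gamma), math.sin(gamma)))
k, _ = element_stiffness(*triangle)
print(nonzero_eigenvalues(k), eigenvalues_from_sides(*triangle), 1.0 / gamma)
```

For a small enclosed angle the largest eigenvalue approaches 1/gamma, so only the upper end of the element spectrum is sensitive to slenderness.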


From the lumped $m_e$ in eq. (7) we derive

$$ \lambda_1^M \ge \frac{1}{3} \min_e(A_e) \tag{27} $$

and consequently

$$ \lambda_1^K \ge \frac{1}{3} \Lambda_1 \min_e(A_e) \tag{28} $$

which assures us that $\lambda_1^K > 0$ if $\Lambda_1 > 0$. But with a careful consideration of the specific
mesh we can do better than (28). Consider the mesh in Fig. 2 that includes one slender
element with area $A_1$. As $A_1 \to 0$ the mass of this element reduces to zero but because
it shares nodes with large elements $\lambda_1^M$ is nearly unaffected by a small $A_1$. Actually, the
smallest mass is at point B.

Equation (22) guarantees that under the circumstances of Fig. 2, $\lambda_1^K$ is not likely to
change much with $A_1$. We shall be more specific about that in the next section.


Penetrated  elastomer.  Track  pads  that  run  over  sharp  objects  suffer  very  large  localized 
deformations. Figure 3 shows the deformation of an originally straight cylinder ABCD
made of rubberlike material, as a result of point E penetrating the body along axis AC.
Elastic  computation  is  done  with  quadratic  elements  and  a  nearly  incompressible  material. 

First  order  triangular  elements  are  judged  adequate  for  a  superposed  thermal  analysis 
of  the  solid  and  for  computational  convenience  the  new  mesh  is  drawn  with  vertices  fixed 
at  the  displaced  nodes.  The  resulting  triangular  grid  is  shown  in  Fig.  3.  Sharp  elements 
are  created  near  point  C  prompting  us  to  suspect  a  loss  of  conditioning.  Notice  that  the 
elastic  deformation  is  nearly  area  preserving  but  each  new  individual  triangle  need  not 
have  the  same  area. 
It is the theoretical and computational conclusion of this paper that the large
deformations and slender elements observed in Fig. 3 have only a marginal influence on the
condition of the global thermal matrix.

The dangerously collapsed triangles in Fig. 3 have a small angle $\gamma$ and we have
from eq. (25) that

$$ \frac{1}{\sin\gamma} \le \lambda_N^K \le \frac{6}{\sin\gamma} \tag{29} $$

and $\lambda_N^K$ is nearly proportional to $\gamma^{-1}$.

We observe that the smallest mass is at point D, while the biggest mass is at the
interior points. Hence

$$ \lambda_1^M = \frac{h^2}{\cdots} \quad \text{and} \quad \lambda_N^M = h^2 \tag{30} $$

and we have from eq. (13) that

$$ \Lambda_1 \frac{h^2}{\cdots} \le \lambda_1^K \le \Lambda_1 h^2 \tag{31} $$

Consequently

$$ \frac{1}{\Lambda_1 h^2 \sin\gamma} \le C_2(K) \le \frac{\cdots}{\Lambda_1 h^2 \sin\gamma} \tag{32} $$

The number $\Lambda_1$ is not known exactly, and under large deformations it is slightly
displacement dependent. To have an idea of what $\Lambda_1$ can be we recall that for a unit square
membrane with fixed edges $\Lambda_1 = 2\pi^2$. But even without a numerical value for $\Lambda_1$, equation (32)
clearly tells us how the spectral condition number $C_2(K)$ of the global thermal matrix K
depends on h and $\gamma$.

In the mesh of Fig. 3 the temperature is prescribed along AB and at point C. Using
conjugate gradients to minimize and maximize $u^T K u / u^T u$ we compute

$$ \lambda_1^K = 0.0285 , \qquad \lambda_N^K = 7.82 , \qquad C_2(K) = 274 $$

for the undeformed mesh, and

$$ \lambda_1^K = 0.0323 , \qquad \lambda_N^K = 16.6 , \qquad C_2(K) = 514 $$

for the deformed mesh. In agreement with the theoretical prediction of eq. (31), $\lambda_1^K$ is
nearly independent of the deformation.

About  three  significant  digits  are  lost  in  the  thermal  finite  element  analysis  of  the 
deformed  body  in  Fig.  3,  a  wholly  tolerable  loss  on  computers  that  typically  carry  seven 
digits  in  single  precision  and  16  in  double. 
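
The digit-loss estimate can be reproduced from the two computed condition numbers. The snippet below is a small illustration that uses the common rule of thumb that roughly log10 of the condition number decimal digits may be lost when solving a linear system; the rule and the function name are assumptions brought in here, not statements from the paper.

```python
import math


def digits_lost(condition_number):
    """Rule-of-thumb estimate of decimal digits lost in solving K u = f."""
    return math.log10(condition_number)


# Condition numbers reported for the undeformed and deformed meshes.
for label, c2 in (("undeformed", 274.0), ("deformed", 514.0)):
    print(label, round(digits_lost(c2), 2))
```

Both meshes give an estimate between two and three digits, so by this measure the deformation costs only a fraction of one additional digit.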
ABSTRACT. Many solutions have been reported for the hydraulic autofrettage
process. In this paper a simple analysis of the swage autofrettage process is
presented. The contact pressure at different locations is determined as a
function of interference. The deformation and stress distribution during
autofrettage are obtained. At the end of the autofrettage process, the permanent
bore enlargement and residual stresses are calculated. Numerical results are
presented in graphical form.

I.  INTRODUCTION.  To  increase  the  maximum  pressure  a  cylinder  can  contain 
without  plastic  deformation  and  to  enhance  its  fatigue  life,  residual  stresses 
are  often  produced  in  cylinders  through  autofrettage  [1].  Many  solutions  have 
been  reported  for  the  hydraulic  autofrettage  process  [2-6].  The  thick-walled 
cylinders  were  subjected  to  uniform  internal  pressure  of  sufficient  magnitude  to 
cause  plastic  deformation  and  then  the  pressure  was  removed. 

A more economical way of producing residual stresses in thick-walled
cylinders is the swage autofrettage process. This process is carried out by a swage,
the diameter of which is greater than the inner diameter of the cylinder. This
swage is driven through the cylinder from one end to the other. A rigorous
analysis of this process is difficult. In this paper a simple analysis of the
swage autofrettage process is reported. The swage mandrel and the cylinder are
made of tungsten carbide and steel, respectively. A two-dimensional
plane-strain analysis is used to determine the contact pressure at different locations
of the cylinder as a function of interference. The deformation and stress
distribution during autofrettage are obtained. At the end of the autofrettage
process, the permanent bore enlargement and residual stresses are calculated.

II. ELASTIC SWAGING. The swage mandrel is assumed to be a short cylindrical
bar driven through a long thick-walled cylinder from one end to the other.
The diameter of the mandrel (2c) is a constant, but the inner and outer
diameters (2a and 2b) of the tube are variables. When the difference between c and
a is positive, we have interference I. For small values of interference, the
stress state in the swaging assembly is elastic. The stresses and displacement
in the tube are

$$ \left. \begin{matrix} \sigma_r \\ \sigma_\theta \end{matrix} \right\} = \frac{p}{1 - a^2/b^2} \left[ \frac{a^2}{b^2} \mp \frac{a^2}{r^2} \right] , \qquad \frac{u}{r} = \frac{p/E}{1 - a^2/b^2} \left[ (1 + \nu) \frac{a^2}{r^2} + (1 - \nu - 2\nu^2) \frac{a^2}{b^2} \right] \tag{1} $$

and in the mandrel

$$ \sigma_r = \sigma_\theta = -p , \qquad \frac{u}{r} = -(1 - \nu_1 - 2\nu_1^2) \frac{p}{E_1} \tag{2} $$

where E, $\nu$ and $E_1$, $\nu_1$ are the material constants of the tube and mandrel,
respectively. At the interface, $u_a$ (tube) - $u_a$ (mandrel) = I by the
compatibility requirement. The interference pressure (p) is a function of the
interference (I) given by

$$ p = \frac{E I}{a} \left( 1 - \frac{a^2}{b^2} \right) \Big/ \left[ (1 + \nu) + (1 - \nu - 2\nu^2) \frac{a^2}{b^2} + (1 - \nu_1 - 2\nu_1^2) \left( 1 - \frac{a^2}{b^2} \right) \frac{E}{E_1} \right] \tag{3} $$


For sufficiently large values of the interference, the stresses in the tube
reach the yield limit. Assuming that Tresca's yield condition governs the
behavior of the material, the tube first becomes plastic at the interface
when the stresses satisfy $\sigma_\theta - \sigma_r = \sigma_0$, where $\sigma_0$ is the initial tensile yield
stress. The solution for the critical interference pressure to cause incipient
plastic deformation is

$$ p^* = \tfrac{1}{2} \sigma_0 \left( 1 - a^2/b^2 \right) \tag{4} $$

and it follows from Eq. (3) that the interference for the onset of plastic flow
is

$$ I^* = \frac{a \sigma_0}{2E} \left[ (1 + \nu) + (1 - \nu - 2\nu^2) \frac{a^2}{b^2} + (1 - \nu_1 - 2\nu_1^2) \left( 1 - \frac{a^2}{b^2} \right) \frac{E}{E_1} \right] \tag{5} $$

which reduces to $I^* = (1 - \nu^2) \, a \sigma_0 / E$ for the special case ($E_1 = E$, $\nu_1 = \nu$).
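
The elastic relations can be evaluated with a short script. This is a sketch, assuming the interference pressure takes the form (E I / a)(1 - a^2/b^2) divided by a bracket of tube and mandrel compliance terms, and that first yield occurs at a pressure of (sigma_0 / 2)(1 - a^2/b^2). The function names and the sample dimensions and material constants are hypothetical.

```python
def compliance_bracket(a, b, E, nu, E1, nu1):
    """Bracketed compliance term shared by the pressure and onset formulas."""
    k = (a / b) ** 2
    return ((1.0 + nu) + (1.0 - nu - 2.0 * nu ** 2) * k
            + (1.0 - nu1 - 2.0 * nu1 ** 2) * (1.0 - k) * E / E1)


def interference_pressure(I, a, b, E, nu, E1, nu1):
    """Elastic contact pressure for a given interference I."""
    k = (a / b) ** 2
    return (E * I / a) * (1.0 - k) / compliance_bracket(a, b, E, nu, E1, nu1)


def critical_pressure(sigma0, a, b):
    """Pressure at which the bore first yields under the Tresca condition."""
    return 0.5 * sigma0 * (1.0 - (a / b) ** 2)


def critical_interference(sigma0, a, b, E, nu, E1, nu1):
    """Interference at the onset of plastic flow."""
    return a * sigma0 / (2.0 * E) * compliance_bracket(a, b, E, nu, E1, nu1)


# Hypothetical values: steel tube, stiffer mandrel (units: mm and MPa).
a, b, E, nu, E1, nu1, sigma0 = 50.0, 100.0, 200.0e3, 0.3, 600.0e3, 0.2, 1000.0
I_star = critical_interference(sigma0, a, b, E, nu, E1, nu1)
print(I_star, interference_pressure(I_star, a, b, E, nu, E1, nu1),
      critical_pressure(sigma0, a, b))
```

Evaluating the pressure at the critical interference returns the critical pressure, which is a quick consistency check between the two formulas.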

III. SWAGING BEYOND THE ELASTIC LIMIT. For values of interference larger
than that given by Eq. (5), a plastic zone forms in the tube, so that for
$a \le r \le \rho$ the tube is plastic, while for $\rho \le r \le b$ the tube material is still in an
elastic state. The elastic-plastic interface radius $\rho$ is a function of the
interference I.


We assume that the steel tube is elastic-ideally plastic, obeying
Tresca's yield criterion and the associated flow theory, but the tungsten
carbide mandrel is elastic. This assumption is justified because the strength
ratio of tungsten carbide to steel is about three. For loading beyond the
elastic limit, the closed-form solution has been found by Koiter [2]. The
expressions for the stresses and displacement in the tube are


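$$\left.\begin{array}{c}\sigma_r/\sigma_0\\ \sigma_\theta/\sigma_0\end{array}\right\} = \tfrac{1}{2}\left(\mp 1 + \frac{\rho^2}{b^2}\right) - \log\frac{\rho}{r} \qquad \text{in } (a \le r \le \rho) \tag{6}$$

$$\left.\begin{array}{c}\sigma_r/\sigma_0\\ \sigma_\theta/\sigma_0\end{array}\right\} = \tfrac{1}{2}\left(\mp\frac{\rho^2}{r^2} + \frac{\rho^2}{b^2}\right) \qquad \text{in } (\rho \le r \le b) \tag{7}$$

$$\frac{E}{\sigma_0}\,\frac{u}{r} = (1-2\nu)(1+\nu)\,\frac{\sigma_r}{\sigma_0} + (1-\nu^2)\,\frac{\rho^2}{r^2} \tag{8}$$

where the elastic-plastic interface radius ($\rho$) is related to the internal pressure ($p$) by

$$p/\sigma_0 = \tfrac{1}{2}\,(1 - \rho^2/b^2) + \log(\rho/a) \tag{9}$$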

For swaging beyond the elastic limit, compatibility requires $u_a(\text{tube}) - u_a(\text{mandrel}) = I$ at the interface, i.e.,

$$\frac{E}{\sigma_0}\,\frac{I}{a} = (1-\nu^2)\,\frac{\rho^2}{a^2} - \left[(1-2\nu)(1+\nu) - (1-\nu_1-2\nu_1^2)\,\frac{E}{E_1}\right]\frac{p}{\sigma_0} \tag{10}$$

Equations (9) and (10) give a parametric representation relating $p$ to $I$ through the parameter $\rho$. The contact pressure at different locations can thus be determined as a function of the interference $I$.
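
The parametric relation in Eqs. (9) and (10) can be sketched numerically as follows. This is a minimal illustration, not the author's program: the function names are hypothetical, radii are measured in units of $a$, the constants are those quoted in Section V, and the bisection assumes that the interference increases with $\rho$ over $1 \le \rho/a \le b/a$.

```python
import math

# Material constants quoted in Section V (steel tube, tungsten carbide mandrel).
NU, NU1 = 0.3, 0.258
E_RATIO = 206.84 / 610.19  # E / E1


def contact_pressure(rho, wall_ratio):
    """Eq. (9): p/sigma0 for plastic radius rho/a and wall ratio b/a."""
    return 0.5 * (1.0 - (rho / wall_ratio) ** 2) + math.log(rho)


def interference(rho, wall_ratio):
    """Eq. (10): dimensionless interference (E/sigma0) I/a."""
    bracket = (1 - 2 * NU) * (1 + NU) - (1 - NU1 - 2 * NU1**2) * E_RATIO
    return (1 - NU**2) * rho**2 - bracket * contact_pressure(rho, wall_ratio)


def plastic_radius(target, wall_ratio, tol=1e-10):
    """Bisect Eq. (10) for rho/a in [1, b/a] (assumes I increases with rho)."""
    lo, hi = 1.0, wall_ratio
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if interference(mid, wall_ratio) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for ratio in (2.765, 1.846, 1.431):
    # rho = a gives the onset of yielding, rho = b the fully plastic state.
    print(ratio, round(interference(1.0, ratio), 3), round(interference(ratio, ratio), 3))
```

Setting $\rho = a$ recovers the elastic limit of Eq. (5), and $\rho = b$ gives the interference at which the wall becomes fully plastic, so the two printed columns bracket the partially plastic range for each wall ratio.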


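IV. UNLOADING ANALYSIS. After swaging, the permanent bore enlargement and residual stresses can be calculated by an unloading analysis. Let a double prime denote a component in the residual state, i.e., $\sigma_\theta'' = \sigma_\theta + \sigma_\theta'$, where a single prime denotes the change during unloading. Assuming elastic unloading, the solution is given by

$$\left.\begin{array}{c}\sigma_r'\\ \sigma_\theta'\end{array}\right\} = -\frac{p}{b^2/a^2-1}\left[\mp\frac{b^2}{r^2} + 1\right] \tag{11}$$

$$E\,u'/r = -\left[(1-\nu) + (1+\nu)\,b^2/r^2\right]\,p/(b^2/a^2-1) \tag{12}$$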

In a recent paper [6], this author presented a more rigorous elastic-plastic unloading analysis based on a new theoretical model that accounts for the Bauschinger and hardening effects during unloading. This model is a very good representation of the material behavior of the high strength steel used in gun barrels [7]. Taking into account the Bauschinger effect ($f$) and the strain-hardening during unloading ($m'$), we have obtained a closed-form solution. On unloading, yielding will occur for $a \le r \le \rho'$ with $\rho' < \rho$. The stresses in the reverse yielding zone ($a \le r \le \rho'$) are given by

$$\sigma_r'/\sigma_0 = p/\sigma_0 - \tfrac{1}{2}\,\beta_2'\,(1+f)(\rho'/a)^2(1 - a^2/r^2) - (1-\beta_2')(1+f)\log(r/a) \tag{13}$$

$$\sigma_\theta'/\sigma_0 = \sigma_r'/\sigma_0 - (1+f)\left[1 + \beta_2'\,(\rho'^2/r^2 - 1)\right] \tag{14}$$

where 

$$\beta_1' = (1-m')/[\,m' + \cdots\,], \qquad \beta_2' = m'\beta_1'/(1-m') \tag{15}$$
The stresses in the elastic zone ($\rho' \le r \le b$) are

$$\left.\begin{array}{c}\sigma_r'/\sigma_0\\ \sigma_\theta'/\sigma_0\end{array}\right\} = \tfrac{1}{2}\,(1+f)\left[\pm(\rho'/r)^2 - (\rho'/b)^2\right]$$

The displacement for the entire tube ($a \le r \le b$) is

$$(E/\sigma_0)\,u'/r = (1-2\nu)(1+\nu)\,(\sigma_r'/\sigma_0) - (1-\nu^2)(1+f)(\rho'/r)^2$$

The residual stresses and displacement are found by addition:

$$\sigma_r'' = \sigma_r + \sigma_r', \qquad \sigma_\theta'' = \sigma_\theta + \sigma_\theta' \qquad \text{and} \qquad u'' = u + u'$$


V. NUMERICAL RESULTS AND DISCUSSION. The material constants used in the calculations are $E$ = 206.84 GPa, $\nu$ = 0.3, $\sigma_0$ = 1.29 GPa, $m'$ = 0.3 for the high strength steel and $E_1$ = 610.19 GPa, $\nu_1$ = 0.258 for the tungsten carbide mandrel. The radius of the mandrel is a constant $c$ = 58.42 mm, but the thickness of the tube varies along the axial direction, with the inner radius ($a$) increasing slightly and the external radius ($b$) tapering more rapidly. The values of $a$ and $b$ at four typical sections are $a_j$ = 56.96, 57.82, 57.99, 58.63 mm and $b_j$ = 157.50, 106.75, 83.00, 83.00 mm, for $j$ = 1, 2, 3, 4, respectively. The corresponding values of wall ratio are $b_j/a_j$ = 2.765, 1.846, 1.431, 1.42 at the four sections. The interference during swaging ($I$) is the difference between $c$ and $a$, counted positive when $c$ exceeds $a$. The values of $I$ at the four sections are $I_j$ = 1.46, 0.60, 0.43, -0.21 mm for $j$ = 1, 2, 3, 4. The negative value of $I_4$ means that there is no contact between the mandrel and the tube. For the positive values of interference, the contact pressure and the stress distribution during swaging can be obtained using the methods presented in Sections II and III. The information after swaging can be obtained by the unloading analysis presented in Section IV.
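
The section data above can be tabulated with a few lines of arithmetic. The sketch below is only an illustration (the names are hypothetical); it recomputes the wall ratios and interferences from the quoted radii, together with the scaled interference $(E/\sigma_0)\,I/a$.

```python
C_MANDREL = 58.42                         # mandrel radius c, mm
A_INNER = (56.96, 57.82, 57.99, 58.63)    # inner radii a_j, mm
B_OUTER = (157.50, 106.75, 83.00, 83.00)  # outer radii b_j, mm
E_OVER_SIGMA0 = 206.84 / 1.29             # E / sigma0 for the steel


def section_summary(a, b, c=C_MANDREL):
    """Return (wall ratio b/a, interference c - a in mm, (E/sigma0)(c - a)/a)."""
    interference = c - a
    return b / a, interference, E_OVER_SIGMA0 * interference / a


for j, (a, b) in enumerate(zip(A_INNER, B_OUTER), start=1):
    ratio, interference, scaled = section_summary(a, b)
    print(f"section {j}: b/a = {ratio:.3f}, I = {interference:+.2f} mm, "
          f"scaled I = {scaled:.2f}")
```

A negative interference, as at the fourth section, signals that the mandrel does not touch the tube there, so only the first three sections enter the swaging analysis.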


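The numerical results are presented in terms of the dimensionless quantities defined by

$$\bar r = r/a, \qquad \bar p = p/\sigma_0, \qquad \bar\sigma_\theta = \sigma_\theta/\sigma_0, \qquad \bar I = (E/\sigma_0)\,I/a, \qquad \bar u = (E/\sigma_0)\,u/a$$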


The contact pressure ($\bar p$) and hoop stress ($\bar\sigma_\theta$) at the interface are presented as functions of the interference ($\bar I$) in Figures 1, 2, and 3 for wall ratios $b/a$ = 2.765, 1.846, 1.431, respectively. The results for swaging within and beyond the elastic limit are included. The pressure is a monotonically increasing function of the interference, but the maximum value of hoop stress occurs at the onset of plastic flow, as shown by the dotted curves. Initial yielding occurs at $\bar I^*$ = 0.774, 0.799, 0.830, and fully plastic flow occurs at $\bar I^{**}$ = 6.638, 2.909, 1.751 for the three wall ratios, respectively. The actual values of interference ($\bar I$) at the three chosen sections are $\bar I_1$ = 4.10, $\bar I_2$ = 1.66, $\bar I_3$ = 1.19. These values indicate that the swaging is partially plastic at these sections in zones 1, 2, and 3. The corresponding locations of the elastic-plastic boundary are given by $\rho/a$ = 2.2001, 1.4196, 1.19205, and the amounts of overstrain are 68,
49.6, and 44.6 percent, respectively. Also shown in Figures 1, 2, and 3 are the values of contact pressure ($\bar p$ = 0.972, 0.555, 0.329) and the hoop stress at the interface, $\bar\sigma_\theta = 1 - \bar p$. The distributions of hoop stresses during swaging are
shown in Figure 4 for typical sections in three zones. The maximum value of hoop stress occurs at the elastic-plastic boundary. The displacement and stresses after swaging can be obtained by an unloading analysis. The distributions of residual hoop stresses are shown in Figure 5 for the chosen sections in three zones. Elastic unloading analysis is justified in zone 3, but reverse yielding occurs in zones 1 and 2 with $\rho'/a$ = 1.305 and 1.014, respectively. Finally, the distributions of residual displacements ($\bar u''$) at typical sections in three zones are presented in Figure 6. Also shown in this figure are the experimental data at the bore. The agreement between the calculated and experimental data is excellent in zone 1, but not so good in zones 2 and 3. This suggests that a more refined analysis is needed for sections with smaller wall ratios. An investigation based on the finite element method is being made and the results will be reported in the near future.
Introduction

Optimal  design  presents  an  extreme  case  of  non-smooth 
mechanics.  The  unknown  becomes  the  density  of  material,  and  in 
an  ordinary  design  the  density  takes  the  value  0  or  1.  It 
describes  a  shape  which  has  least  weight  subject  to  the 
constraints.  However  the  optimal  design  is  frequently  not  at 
all  ordinary.  It  is  given  by  the  "weak  limit"  of  a  sequence  of 
designs  in  which  the  density  oscillates  more  and  more  rapidly 
between  0  and  1.  In  other  words  the  average  density  can 
take  fractional  values,  and  no  ordinary  shape  achieves  the 
minimum  weight. 

Mathematically  this  is  an  instance  of  the  relaxation  of  a 
nonconvex  problem.  That  is  a  special  topic  in  the  calculus  of 
variations,  to  widen  the  class  of  admissible  functions  so  that 
the  problem  becomes  correctly  posed  and  its  minimum  is 
achieved.  To  the  given  nonconvex  problem  we  associate  a 
relaxed  problem  with  the  same  minimum.  The  solutions  of  the 
relaxed  problem  are  the  weak  limits  of  minimizing  sequences  in 
the  original  problem. 

Our  application  of  this  technique  is  to  a  question  of 
"optimal  bounds"  for  composite  materials.  Its  solution  has 
been a major achievement of the Tartar-Murat method of compensated compactness. That method also searches for functionals that are weakly lower semicontinuous, which is the key property implied by convexity. If the unknown is a vector function like $(u_1(x,y), u_2(x,y))$ then convexity must be replaced by quasiconvexity: this is crucial below. Our goal is
to  give  a  variational  statement  of  the  problem  of  optimal 
bounds,  and  an  alternate  approach  to  its  solution — in  which  we 
find  the  relaxed  problem  and  solve  it. 

The  Design  of  a  Conductor 

How  should  a  fixed  number  of  resistors  be  arranged,  in 
order  to  maximize  the  current?  For  current  in  one  direction 
the answer is easy: They go in parallel. The combined resistance is the harmonic mean of the individual resistances divided by their number; equivalently, the net conductance is the sum of
the  conductances.  The  problem  becomes  more  serious  if  we  are 
measuring  two  currents,  north-south  and  east-west.  In  that 
case  resistors  in  one  direction  contribute  little  or  nothing  to 
flow  in  the  other  direction.  An  optimal  two-way  design  is  not 
clear.  It  is  a  much  more  complicated  series-parallel 
connection,  and  the  rules  of  the  competition  become  important. 

We  propose  to  make  the  problem  continuous  rather  than 
discrete.  Instead  of  current  between  nodes,  we  measure  current 
across  a  unit  square.  In  that  square  we  place  conducting 
material — as  much  as  we  have,  in  the  best  orientation  we  can 
find. If the area of conducting material is A, it leaves an insulated area 1 - A through which nothing flows. Then we impose a unit voltage difference between the left and right sides of the square, or the top and bottom, and measure the current.

For flow in one direction, the design problem is easy. By placing the conductor in a strip across the square, or in several parallel strips, the current is maximized. Suppose the specific resistance of the conductor is also unity (completing a wanton destruction of dimensional arguments). When A = 1 and the square is full, the net current is 1. As the area (and strip width) A goes below 1, the horizontal current remains equal to A. The overall resistance is 1/A in that direction; the resistance in the vertical direction is infinite.

The  real  problem  is  to  design  a  single  conductor  to  carry 
flow  in  both  directions,  up  the  square  as  well  as  across. 

Those  measurements  are  done  separately.  A  voltage  difference 
between  x  =  0  and  x  =  1  produces  horizontal  current,  and 
between  y  =  0  and  y  =  1  it  produces  vertical  current.  A 
strip  that  does  well  for  one  does  badly  for  the  other.  The 
question  now  involves  a  function  of  two  variables: 


To  achieve  a  horizontal  current  C  <  1 
and  a  vertical  current  D  <  1,  what 
is  the  minimum  possible  conducting 
area  A  ? 

For C = 1/3 and D = 0, the minimum area is A = 1/3. The conducting material is in horizontal strips. For C = 1/3 and D = 1/3, the natural construction is to use both horizontal and vertical strips. If their widths are 1/3, then the area they cover is

A = 1/3 + 1/3 - 1/9 = 5/9 (after subtracting 1/9 for overlap).


In general the area occupied by a two-strip design is C + D - CD. This "Red Cross design" will certainly carry the required currents (or more), with voltage differences equal to 1. The question is whether the area A = C + D - CD is minimal. The answer is no.


The simplest design (Fig. 1) is not optimal. In fact it carries more current than required; we have considerably underestimated its conductance, by taking it to be C horizontally and D vertically. When the strips have width 1/3, the actual currents (in each direction separately) are $1/\sqrt{7}$. Heuristically, part of the horizontal current makes use of the vertical strip. The computation uses Laplace's equation in the cross, with potentials 1 and 0 on the left and right sides. This design can achieve currents $C = D = 1/\sqrt{7}$ with area 5/9 (which is less than C + D - CD). Nevertheless the area can still be reduced.

In  this  note  we  describe  one  possible  optimal  design  (it 
is  already  known).  More  precisely,  we  describe  a  sequence  of 
designs  whose  areas  approach  the  minimum  value  A.  That  value 
is  achieved  in  the  limit,  which  becomes  a  composite  material — a 
mixture  of  conductors  and  insulators  with  a  properly  chosen 
microstructure.  The  effective  conductances  of  this  limiting 
composite  can  be  computed,  and  they  are  C  and  D.  It  is  a 
straightforward  problem  in  homogenization,  except  that  the  goal 
is  not  the  usual  one — to  compute  effective  conductances  for  a 
given  microstructure.  Our  problem  is  optimization,  to  find  the 
best  composite. 

The  design  is  not  unique.  For  equal  values  of  C  and  D 
the  composite  will  be  isotropic,  and  an  optimal  design  was 
found  by  Hashin  and  Shtrikman  [1].  They  filled  the  square  with 
circular  disks,  each  consisting  of  a  conducting  ring  around  a 
smaller  insulated  disk.  With  the  right  ratio  of  radii,  and  a 
packing  by  infinitely  many  disks,  the  properties  are  optimal. 
The extension to anisotropic designs ($C \ne D$) was found by Tartar and Murat [2], who replaced the circles by ellipses.

But  the  real  achievement  of  these  authors  was  not  the 
construction  of  optimal  designs;  it  was  the  proof  that  no  other 
design  could  be  better.  That  is  a  subtle  problem,  to  admit  all 
microstructures.  It  led  Tartar  and  Murat  to  develop  the  theory 
of  "compensated  compactness,"  a  systematic  approach  to  weak 
limits — when  functions  (or  designs)  can  oscillate  more  and  more 
rapidly,  and  only  certain  average  values  have  a  stable  meaning. 
That  theory  has  extremely  valuable  applications,  far  outside 
the  present  problem. 

Our  goal  is  to  contribute  one  more  proof  that  the  area  A, 
given  below,  is  actually  minimal.  It  is  based  squarely  on  [3], 
in  which  we  computed  the  minimum  value  of  a  specific  nonconvex 
functional.  That  nonconvexity  is  typical  of  optimal  design 
theory,  in  which  the  original  statement  is  a  "0-1 
problem" — there  is  a  conductor  or  an  insulator  in  each 
subregion.  Just  as  integer  programming  is  nonconvex  and 
difficult  in  comparison  with  linear  programming,  so  our 
continuous  problem  needs  to  be  relaxed  to  a  variational  problem 
with  reasonable  solutions  and  the  same  minimum.  Those 
reasonable  solutions  will  be  the  weak  limits,  or  averages,  of 
the  unreasonable  designs  which  appear  in  the  0-1  formulation. 
In  other  words,  we  allow  ourselves  to  construct  composite 
materials  out  of  the  original  materials,  and  this 
homogenization  process  gives  to  the  original  nonconvex  problem 
a  new  and  more  satisfactory  form.  In  the  case  of  one  current 
it becomes convex. In our present case of two currents it becomes polyconvex, and can be solved.

The  construction  will  not  be  based  on  circles  or  ellipses. 
A  different  class  of  designs  was  developed  by  Lurie  and 
Cherkaev  [4],  who  stayed  with  strips  but  made  them  extremely 
thin. In the limit it is the density and direction of the strips that determines everything: it decides the conducting area A and the macroscopic properties of the composite. With this "strip construction" the calculations become easier; the next section nearly returns to resistors in series and parallel. The construction is also a realization in physical terms of the mathematical process of convexification: to fill
in  the  line  segments  between  any  pair  of  points,  and  then  to 
fill  in  line  segments  to  any  of  the  new  points,  and  so  on.  In 
the  case  of  two  currents  and  a  conductivity  matrix,  these 
become  line  segments  between  matrices  whose  difference  has  rank 
one.  That  is  algebraically  more  delicate,  and  in  the  present 
construction  it  produces  "strips  of  strips" — but  the  underlying 
idea  is  not  changed. 

One contribution of this note is to give a fresh statement
of  the  variational  problem.  We  have  found  it  useful  to  ask  for 
the  minimum  area  A  as  a  function  of  C  and  D,  rather  than 
to  describe  all  the  conductivity  tensors  that  can  be  achieved 
with  prescribed  area  A.  (The  two  forms  are  equivalent;  it  is 
like  giving  a  function  A(C,D)  instead  of  its  level  sets.)  We 
will  not  study  the  worst  composites,  which  are  also  of  interest 
and  are  closely  related.  Finally  we  reemphasize  that  the 
construction  is  easier  to  discover  than  a  proof  of  its 
optimality,  but  nevertheless  several  proofs  have  been  given: 
Kohn  and  Milton  [5]  have  provided  a  comprehensive  analysis  of 
the problem of optimal bounds.

The  principal  application  is  to  structural  problems — the 
weight  minimization  of  an  elastic  body  subject  to  constraints 


from  the  loads.  This  is  the  shape  optimization  pioneered  by 
Michell  and  Prager,  and  highly  developed  in  the  work  of  Rozvany 
[6,7].  Mathematically  it  rests  on  the  relaxation  of 
variational  problems.  Our  joint  paper  [3]  gives  the  underlying 


theory,  which  leads  to  a  systematic  procedure  for  computing  the 
relaxed  problem — in  which  homogenization  is  successful  and  the 
minimum  weight  is  attained.  The  design  problem  also  extends  to 
plates,  where  the  appearance  of  more  and  more  stiffeners  in  the 
numerical  solution  of  a  0-1  problem  led  Olhoff  and  Cheng  [3] 
to  discover  the  approach  to  a  composite.  The  plate  equation  is 
of  higher  order  than  our  electrical  conduction  problem,  but  we 
anticipate  that  the  strip  construction  still  leads  to  an 
optimal  design — and  that  the  theory  of  homogenization  (or 
relaxation,  or  polyconvexity)  will  yield  a  proof  that  this 
construction  is  optimal. 


We go back to the Red Cross pattern, in order to improve on it. The improvement comes by making it easier for vertical current to use the horizontal strip. As it stands, the current has to make a long excursion; the vertical current away from the main vertical strip is exponentially small. We divide that strip into N vertical strips spread across the square (Fig. 1b). As $N \to \infty$, that part becomes a composite, still with infinite resistance in the horizontal direction. When the density of vertical strips is E, and the height of those strips is 1 - C (as before), the vertical resistance of the composite is (1 - C)/E. Since the composite is in series with a conducting strip of resistance C (to vertical flow) the effective properties are:

    vertical resistance     C + (1 - C)/E
    horizontal resistance   1/C
    conducting area         A = C + E(1 - C).
The desired value of vertical resistance is 1/D, to produce current D with unit voltage drop. Therefore

$$C + \frac{1-C}{E} = \frac{1}{D} \qquad \text{or} \qquad E = \frac{D - CD}{1 - CD}.$$

The total conducting area is

$$A = C + \frac{D - CD}{1 - CD}\,(1 - C) = \frac{C + D - 2CD}{1 - CD}.$$

This is the optimal value established (but differently expressed) by Tartar and Murat.

For small currents the area is close to C + D; the economy from overlapping use is small. However the first correction term is -2CD, better than C + D - CD from the cross pattern. For large currents the improvement increases. In the example that previously filled 5/9 of the square, we now have:

C = 1/3 and D = 1/3 require only the area A = 1/2.

In that case the density of vertical strips is E = 1/4.
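
The comparison between the cross pattern and the strip composite can be reproduced in exact arithmetic. The short sketch below uses hypothetical function names and simply evaluates the formulas stated above for C = D = 1/3.

```python
from fractions import Fraction


def cross_area(c, d):
    """Area of the two-strip 'Red Cross' design: C + D - CD."""
    return c + d - c * d


def strip_density(c, d):
    """Density E of vertical strips that gives vertical resistance 1/D."""
    return (d - c * d) / (1 - c * d)


def composite_area(c, d):
    """Conducting area A = C + E(1 - C) of the strip composite."""
    return c + strip_density(c, d) * (1 - c)


c = d = Fraction(1, 3)
print(cross_area(c, d), strip_density(c, d), composite_area(c, d))  # 5/9 1/4 1/2
```

Using `Fraction` keeps the results exact, so the identity A = (C + D - 2CD)/(1 - CD) can be checked for any rational C and D below 1.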

Note that we have not filled the square with a single homogenized composite. That is easy to do, keeping the properties optimal. If we alternate rapidly between M horizontal strips of conductor and composite (Fig. 2), then the properties are not changed. As $M \to \infty$ this produces a homogeneous material formed from "strips of strips." This would be the local construction in each small square of a larger design, in which the conductances and area fraction A may vary throughout the region. That global problem has a straightforward variational statement, after the local solution has produced the "relaxed" integrand. In other words, the present construction produces a family of optimal composites to be called on for a globally optimal design [3].

In a manufactured design, M and N are finite. However we should not choose M = N. That choice would homogenize the simple Red Cross pattern, without improving it. The composite of vertical strips ($N \to \infty$) need not be completed before M increases (N may grow along with M), but it must proceed more quickly. Of course the vertical and horizontal directions could be reversed, to give a different design that is equally optimal (and elliptical inclusions are a third possibility). It is not known how to describe all composites that achieve the optimal bounds. Only the bounds themselves are known, and we come now to the proof that A above is minimal.


The Variational Problem


Suppose $S$ is the open unit square, partly insulated and partly conducting. When a unit voltage is applied between $x = 0$ and $x = 1$, current flows. It is described by a vector whose divergence is zero (there are no sources inside the square). Therefore the vector has the form $(\partial u/\partial y, -\partial u/\partial x)$ for some stream function $u(x,y)$. For any $u$ this vector has divergence $\partial^2 u/\partial x\,\partial y - \partial^2 u/\partial y\,\partial x = 0$. It gives the magnitude $|\nabla u|$ and also the direction of the current at each point. In an insulated region, the magnitude is $|\nabla u| = 0$ and the stream function is constant. At the boundary of such a region the normal derivative from both sides is $\partial u/\partial n = 0$. At the lower boundary of the square we impose $u = 0$, and at the upper boundary $u = C$, in order that a current $C$ shall flow from left to right. (The increase $u(Q) - u(P)$ in the stream function gives the flow across a path from $P$ to $Q$.) Since the conducting material has unit specific resistance, the heat loss, which is $I^2R$ in a single resistor, is $\iint |\nabla u|^2\,dx\,dy$. That equals current times voltage, or $C$ times 1. This current is to be achieved in the smallest possible conducting area. That area is identified by the condition $\nabla u \ne 0$ (current is flowing) and the problem becomes:


Minimize the area in which $\nabla u \ne 0$, subject to

$$\iint_S |\nabla u|^2\,dx\,dy = C, \qquad u(x,0) = 0, \qquad u(x,1) = C.$$


This one-dimensional problem is solved by a horizontal conducting strip of height $C$. The stream function can be $u = y$ for $y \le C$, $u = C$ for $y \ge C$. Then $|\nabla u| = 1$ in the strip and $\nabla u = 0$ elsewhere. The constraints are met, and the strip area $C$ is minimal.


Note: The constraint $\iint |\nabla u|^2\,dx\,dy = C$ has used the fact that the actual current minimizes this integral, and therefore satisfies Laplace's equation in the conducting area. The physical argument based on heat loss can be replaced by Green's identity

$$\iint |\nabla u|^2\,dx\,dy = \iint u\,(-u_{xx} - u_{yy})\,dx\,dy + \int u\,\frac{\partial u}{\partial n}\,ds.$$

On the right side the only nonzero term is the integral of $u\,\partial u/\partial n$ along the top of the square, where $u = C$ and $\int \partial u/\partial n\,ds$ = voltage drop = 1. Therefore $\iint |\nabla u|^2\,dx\,dy = C$.


It is important to see that the problem above, while not difficult, is also not convex. The minimization of area is actually the minimization of $\iint 1_{\{\nabla u \ne 0\}}\,dx\,dy$, where the symbol $1_K$ represents a characteristic function: the function which equals one in the set $K$ where $\nabla u \ne 0$, and zero outside. It is the nonconvex 0-1 function illustrated by Fig. 3a. The value zero looks isolated, but if $\nabla u = 0$ in a large set, then the integral is small. The goal is to achieve $\nabla u = 0$ as often as possible, and the constraint is introduced through a Lagrange multiplier $\lambda$: the functional becomes

$$L(u,\lambda) = \iint \left[1_{\{\nabla u \ne 0\}} + \lambda\,|\nabla u|^2\right] dx\,dy - \lambda C.$$

It is this integrand $F = 1 + \lambda|\nabla u|^2$, with the isolated value $F(0) = 0$, which is illustrated in Fig. 3b. It needs to be relaxed.

For a problem in which the unknown is a scalar, the relaxation of $F$ is the same as its convexification. We may replace $F$ by the largest convex function that satisfies $F_c \le F$, without changing the minimum value of the integral. (The minimizing function $u^*$ may be changed radically. For the original $L$ it may not have existed.) In this problem $F_c$ grows linearly with $|\nabla u|$, up to the point where $\lambda|\nabla u|^2 = 1$ and $F_c$ is tangent to $F$. Prior to that point the convexified functional is

$$L_c(u,\lambda) = \iint F_c\,dx\,dy - \lambda C = \iint 2\lambda^{1/2}\,|\nabla u|\,dx\,dy - \lambda C.$$

The minimizing $u^*$, which must go from zero at $y = 0$ to $C$ at $y = 1$, can be taken linear: $u^* = yC$. Then $|\nabla u^*| = C$ and the functional is

$$L_c(u^*,\lambda) = 2\lambda^{1/2}C - \lambda C.$$

The maximum over $\lambda$ occurs at $\lambda^* = 1$, and yields the minimum area subject to the constraint: optimal area = $C$. We note that $\lambda^*|\nabla u^*|^2 = C^2 < 1$, so the minimum does occur in the range where $F_c$ is strictly below $F$. This is the "homogenized" regime, oscillating between insulator and conductor (between $F = 0$ and $F = 1 + \lambda|\nabla u|^2$), in which the average of $F$ is $F_c$.

One further comment on this easy problem, the design of a one-way conductor. It was made to look difficult by relaxation! A simpler optimizer is the one proposed at the start, with stream function $u = y$ for $y \le C$ and $u = C$ elsewhere. That choice leads to $\lambda|\nabla u|^2 = 1$ in one strip and $\nabla u = 0$ in the complementary strip, and no relaxation occurs. Each region is fully conducting or fully insulated; the 0-1 problem attains the same minimum as the homogenized problem.

In fact the homogenized solution is the one suggested by Fig. 2, in which the horizontal conductor is split into $M$ strips, with $M \to \infty$. The result is a composite conductor (horizontal only; the vertical part of Fig. 3 is not present) through which the current is uniform. That corresponds to our relaxed solution $u^* = yC$ over the whole square.

Thus the one-way problem illustrates relaxation in a case where it is not needed. The minimum area $C$ is also attained in the unrelaxed problem. However our proof that this is the minimum used convexification: for $\lambda = 1$ and any admissible $u$,

$$\text{area} = \iint \left[1_{\{\nabla u \ne 0\}} + |\nabla u|^2\right] dx\,dy - C \;\ge\; \iint 2\,|\nabla u|\,dx\,dy - C \;\ge\; C.$$

In the two-way problem relaxation is absolutely needed (the original has no solution) but a simple convexification is no longer correct.
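
The one-way argument can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original proof: the function names are hypothetical, `t` stands for $|\nabla u|$, and the maximization over the multiplier is done on a plain grid with an arbitrarily chosen current C = 0.4.

```python
import math


def F(t, lam):
    """Unrelaxed integrand: 0 where no current flows, 1 + lam*t^2 elsewhere."""
    return 0.0 if t == 0 else 1.0 + lam * t * t


def F_c(t, lam):
    """Convex envelope of F: linear up to the tangency point lam*t^2 = 1."""
    if lam * t * t <= 1.0:
        return 2.0 * math.sqrt(lam) * t
    return 1.0 + lam * t * t


def dual(lam, C):
    """Convexified functional at u* = yC, valid while lam*C^2 <= 1."""
    return 2.0 * math.sqrt(lam) * C - lam * C


C = 0.4  # illustrative current, chosen so that lam*C^2 <= 1 on the whole grid
grid = [k / 1000 for k in range(1, 3001)]
best = max(grid, key=lambda lam: dual(lam, C))
print(best, dual(best, C))  # the maximizing multiplier and the resulting area
```

The grid maximum sits at the multiplier value 1 and returns the area C, in line with the statement that the strip of height C is minimal.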


The  Two-Way  Problem 

The variational statement involves two stream functions $u(x,y)$ and $v(x,y)$. The unknown is now a vector. Its first component is constrained by $u(x,0) = 0$ and $u(x,1) = C$ and $\iint |\nabla u|^2\,dx\,dy = C$, as before. The second component $v$ reflects the vertical current $D$, which is required to flow when a unit voltage drop is imposed between the bottom and top of the square. In the region where both currents are zero, $\nabla u = 0$ and $\nabla v = 0$, conducting material is not needed. The conductor occupies the set

$$K = \{\nabla u \ne 0\} \cup \{\nabla v \ne 0\},$$

whose area it is our goal to minimize. The problem becomes:

Minimize area($K$) $= \iint 1_K\,dx\,dy$ subject to $\iint |\nabla u|^2\,dx\,dy \le C$, $\iint |\nabla v|^2\,dx\,dy \le D$, $u(x,0) = 0$, $u(x,1) = C$, $v(0,y) = 0$, $v(1,y) = D$.

The strip design proposed earlier has area $A = (C + D - 2CD)/(1 - CD)$. We now show that this is minimal.

The problem is again nonconvex because of the 0-1 characteristic function $1_K$. Introducing the constraints by Lagrange multipliers $\lambda$ and $\mu$, the unrelaxed functional is

$$L(u,v,\lambda,\mu) = \iint \left[1_K + \lambda\,|\nabla u|^2 + \mu\,|\nabla v|^2\right] dx\,dy - \lambda C - \mu D. \tag{1}$$
It could be convexified, but the minimum of $L_c$ is too low: it is below that of $L$. The correct relaxation is the quasiconvexification $L_r$, the largest functional below $L$ which is weakly lower semicontinuous in $H^1$. Its minimizing functions $u^*, v^*$ will be the weak limits of minimizing (but highly oscillatory) sequences for $L$.

The difficulty is to compute the relaxed form $L_r$. That was the goal of our paper [3]. The property of quasiconvexity is difficult to verify, but in several important examples a stronger property holds and can be tested. This stronger property of the relaxation $L_r = \iint F_r(\nabla u, \nabla v)\,dx\,dy$ is polyconvexity: $F_r$ is a convex function of $\nabla u$ and $\nabla v$ and the Jacobian determinant $J = |\nabla u \;\; \nabla v|$.

The Jacobian is itself nonconvex, so that a polyconvex function
need not be convex. It will be the upper envelope of a family
of multilinear functions (linear in $J$ as well as $\nabla u$ and $\nabla v$)
in the same way that convex functions are envelopes of linear
functions of $\nabla u$ and $\nabla v$. In this problem the unrelaxed
integrand is

$$F = \begin{cases} 0 & \text{if } \nabla u = \nabla v = 0 \\ 1 + |\nabla u|^2 + |\nabla v|^2 & \text{otherwise.} \end{cases}$$

The notation has incorporated $\lambda$ and $\mu$ into $u = \lambda^{1/2} u$ and
$v = \mu^{1/2} v$. We note that the Lagrange multipliers are

nonnegative because the constraints are inequalities: the
designer is happy if the conductor offers less resistance than
specified to one or other of the currents. In the optimal
design we expect equality, $\iint |\nabla u|^2 = C$ and $\iint |\nabla v|^2 = D$.
The relaxation of $F$ is known from [3]:

$$F_r = \begin{cases} 2\rho - 2|J| & \text{if } \rho \le 1 \\ 1 + |\nabla u|^2 + |\nabla v|^2 & \text{if } \rho \ge 1 \end{cases}$$

where $\rho = (|\nabla u|^2 + |\nabla v|^2 + 2|J|)^{1/2}$ and $J = |\nabla u \ \ \nabla v|$. We
show below that $F_r$ is polyconvex and $F_r \le F$. We need not
show that $F_r$ is the correct relaxation, although it is; no
quasiconvex function is between $F$ and $F_r$. That fact is not
required for our specific (and self-contained) problem. We
know that the constraints can be satisfied in a conducting area
$A = (C + D - 2CD)/(1 - CD)$, and our only task is to prove that
the area cannot be smaller.

Provided $F_r$ is polyconvex, the associated variational
problem can be solved:

$$\text{Minimize } \iint F_r\,dx\,dy \text{ subject to } \begin{cases} u(x,0) = 0, & u(x,1) = \lambda^{1/2} C \\ v(0,y) = 0, & v(1,y) = \mu^{1/2} D. \end{cases}$$

The constraints are satisfied by the linear functions

$$u = \lambda^{1/2} C y \quad \text{and} \quad v = \mu^{1/2} D x.$$

For those functions the Jacobian is constant, and the
fundamental condition for quasiconvexity is that such a
candidate (if it satisfies the boundary conditions, and is
therefore admissible) is always minimizing. Therefore the
minimum value of $\iint F_r\,dx\,dy$, after integration of a constant
over the unit square, is

$$2\left(\lambda^{1/2} C + \mu^{1/2} D - \lambda^{1/2}\mu^{1/2} C D\right).$$

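Remembering the final terms $-\lambda C - \mu D$ in the Lagrangian (1),
we are left with a maximization over $\lambda$ and $\mu$:

$$A^* = \max_{\lambda,\mu} \min_{u,v} L_r = \max_{\lambda,\mu} \ 2\left(\lambda^{1/2} C + \mu^{1/2} D - \lambda^{1/2}\mu^{1/2} C D\right) - \lambda C - \mu D. \qquad (2)$$

Differentiating with respect to $\lambda$ and $\mu$, the Lagrange
multipliers are

$$\lambda^{1/2} = \frac{1-D}{1-CD} \quad \text{and} \quad \mu^{1/2} = \frac{1-C}{1-CD}.$$

Then substituting into (2) yields

$$A^* = \frac{C + D - 2CD}{1 - CD} \quad \text{(which coincides with } A\text{)}. \qquad (3)$$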

This calculation assumed that the minimum occurs when $\rho \le 1$.
That is easily verified. In fact $\rho$ turns out to equal the
density of conducting material, and in the end $\rho = A$, because
the density has this constant value over a unit square.

To repeat the main line of the argument: The area of the
design cannot go below $A^*$, because

i) $F_r \le F$ and thus $L_r \le L$ for each nonnegative $\lambda$
and $\mu$

ii) $F_r$ is polyconvex, so that the associated functional
attains its minimum

iii) that constrained minimum is $A^*$, which coincides
with the area $A$ approached by the strip construction.

The minimum of the relaxed problem is attained by linear stream
functions $u^*$ and $v^*$, corresponding to uniform flow through
the square, which in the relaxed problem is covered by a
homogeneous composite. In fact $F_r$ was computed in [3]
precisely by applying the strip construction. Our observation
here is that we need only its polyconvexity, in order to solve
the variational problem in this paper, and that this problem is
a restatement of the optimal bound problem. Thus the argument
depends on establishing that something is lower
semicontinuous! Tartar and Murat did this in another way.

The proof of polyconvexity could display the multilinear
functions whose envelope is $F_r$, but the result comes more
neatly as follows. Start with the convex function

$$c(t) = \begin{cases} 2t & 0 \le t \le 1 \\ 1 + t^2 & t \ge 1. \end{cases}$$

Then consider the two functions

$$F_\pm(\nabla u, \nabla v, J) = c\left(\left[\, |\nabla u|^2 + |\nabla v|^2 \pm 2 \det[\nabla u \ \ \nabla v] \,\right]^{1/2}\right) \mp 2J.$$

For either sign, the quantity $Q$ in brackets is a nonnegative
quadratic form in $\nabla u, \nabla v$, and therefore a sum of squares. Its
square root $t$ is a convex function, and $c(t)$ is convex and
increasing. Therefore the composition $c(t(\nabla u, \nabla v))$ is convex.
The linear terms $\mp 2J$ leave $F_\pm$ convex, as functions with an
extra argument. Then because $F_r$ is the maximum of the two
functions $F_\pm$ when $J$ is identified with $\det[\nabla u \ \ \nabla v]$, $F_r$
must be polyconvex.

For that last step, note that $c = 1 + t^2$ for large $t$
and the functions $F_\pm$ become $1 + |\nabla u|^2 + |\nabla v|^2$. For small $t$
the comparison between $F_+$ and $F_-$ rests on the inequality

$$2(r+s)^{1/2} - s \ge 2(r-s)^{1/2} + s.$$

This holds for $r \ge s \ge 0$, $r + s \le 1$. In our case
$r = |\nabla u|^2 + |\nabla v|^2$ and $s = 2|J|$. Thus the maximizing choice
of sign is the one for which $\pm \det[\nabla u \ \ \nabla v]$ equals the
absolute value $|J|$. With that choice the argument $t$
coincides with $\rho$ in the definition of $F_r$, and $\max F_\pm$
coincides with $F_r$.

Finally $F_r$ is below $F$ because $2t$ is below $1 + t^2$.
The difference $(1-t)^2 = (1-\rho)^2$ in the range $0 \le \rho \le 1$ is
the saving in area achieved by homogenization.
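The two facts used in this proof, that the maximum of the two functions built from c(t) reproduces the relaxed integrand and that the relaxed integrand never exceeds the unrelaxed one, can be spot-checked on random gradients. The following is a rough numerical sketch under the formulas stated above; the function names, the sampling range and the sample count are assumptions made for the illustration.

```python
import math
import random


def c(t):
    """Convex, increasing function: 2t on [0, 1], 1 + t^2 beyond."""
    return 2.0 * t if t <= 1.0 else 1.0 + t * t


def invariants(gu, gv):
    """Return r = |grad u|^2 + |grad v|^2 and the Jacobian J."""
    r = gu[0] ** 2 + gu[1] ** 2 + gv[0] ** 2 + gv[1] ** 2
    J = gu[0] * gv[1] - gu[1] * gv[0]
    return r, J


def F(gu, gv):
    """Unrelaxed integrand: 0 where both gradients vanish, else 1 + r."""
    r, _ = invariants(gu, gv)
    return 0.0 if r == 0.0 else 1.0 + r


def F_relaxed(gu, gv):
    """Relaxed integrand F_r expressed through rho and |J|."""
    r, J = invariants(gu, gv)
    rho = math.sqrt(r + 2.0 * abs(J))
    return 2.0 * rho - 2.0 * abs(J) if rho <= 1.0 else 1.0 + r


def F_sign(gu, gv, sign):
    """F_plus (sign = +1) or F_minus (sign = -1)."""
    r, J = invariants(gu, gv)
    t = math.sqrt(max(r + sign * 2.0 * J, 0.0))
    return c(t) - sign * 2.0 * J


def check(samples=20000, seed=0):
    """Largest violation of max(F_+, F_-) = F_r and of F_r <= F."""
    rng = random.Random(seed)
    worst_envelope = 0.0
    worst_bound = 0.0
    for _ in range(samples):
        gu = (rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5))
        gv = (rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5))
        fr = F_relaxed(gu, gv)
        envelope = max(F_sign(gu, gv, 1.0), F_sign(gu, gv, -1.0))
        worst_envelope = max(worst_envelope, abs(envelope - fr))
        worst_bound = max(worst_bound, fr - F(gu, gv))
    return worst_envelope, worst_bound


print(check())
```

Both reported numbers should be at the level of rounding error. A random test of this kind does not replace the proof, but it is a quick way to catch a sign error in the formulas.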
ON A REFINED NONLINEAR THEORY OF
LAMINATED COMPOSITE PLATES


J.  N.  Reddy* 

Department  of  Engineering  Science  and  Mechanics 
Virginia  Polytechnic  Institute  and  State  University 
Blacksburg,  VA  24061 

Abstract. This paper summarizes the results of research on the
development of a refined shear deformation theory of plates and its
analytical solutions in the linear case. The detailed results are
reported in two technical papers, which will appear elsewhere. A
third-order, nonlinear shear deformation plate theory that accounts for
parabolic distribution of transverse shear stresses through thickness
and moderate rotation terms is presented. The Levy type analytical
solutions are developed for the linear case.

I. INTRODUCTION. The advent of new composite materials and their
increasing use in various fields of advanced technology has generated a
new interest in the development and solution of consistent refined
theories of anisotropic composite plates and shells. This interest is
due to the fact that the classical plate theory, in terms of its basic
assumptions (i.e. the Kirchhoff hypothesis), comes in conflict with real
behavior of these new materials. For example, in contrast to the basic
assumption of infinite rigidity in transverse shear in the classical
plate theory, the new composite materials exhibit a finite rigidity in
transverse shear. This property requires the incorporation of
transverse shear deformation effects.

In addition to other shortcomings, the classical plate theory
involves a contradiction between the number of boundary conditions
physically required to be fulfilled on a free boundary and the number
available in theory, which is to be consistent with the order of the
associated governing equations (see Stoker [1]). The non-fulfillment of
boundary conditions on the bounding surfaces constitutes another feature
of the classical theory. In recent years attempts were made to refine
the classical theory by: (i) incorporating transverse shear effects,
(ii) removing the contradiction which concerns the number of boundary
conditions to be prescribed at each edge, and (iii) fulfilling the
boundary conditions on the bounding surfaces and, in the case of
laminated composite plates and shells, the continuity conditions at
the interfaces between the contiguous layers. In addition, the refined
transverse shear deformation theories can be used to model
anisotropic plates and shells whose material exhibits a high degree of
anisotropy, and are not restricted to the thinness requirement implied
by the classical laminate theory. Another feature of refined laminate
theories concerns the adequate incorporation of the dynamical effects
allowing the evaluation of the lowest and higher natural frequencies.

The shear deformation theories known in the literature can be
grouped into two classes: (1) stress-based theories, and (2)
displacement-based theories. The first stress-based transverse shear
deformable plate theory is due to Reissner [2-4]. The distribution
across the thickness of the transverse normal and shear stresses is
determined through integration over the thickness of the equilibrium
equations of the 3-D elasticity theory. The associated field equations
and boundary conditions expressed in terms of 2-D quantities can be
determined by using the variational principles of the 3-D elasticity
theory, or by considering the moments of nth order of the basic
equations of 3-D elasticity theory. Both methods allow the reduction of
the 3-D problem to an equivalent 2-D one.

The pioneering work of the displacement-based theories is due to
Basset [5]. Based on Basset's representation of the displacement field,
Hildebrand, Reissner and Thomas [6] developed a variationally consistent
first order theory for shells. The field equations were derived using
the principle of minimum total potential energy. By using the
displacement representation of Basset, Mindlin [7] extended Hencky's theory [8]
of isotropic plates to the dynamic case. The shear deformation theory
of Hencky-Mindlin is referred to as the first-order transverse shear
deformation theory. Recently, Reddy [9-11] developed a variationally
consistent third-order shear deformation theory that accounts for
parabolic distribution of transverse shear stresses through thickness
and the von Karman strains.
In geometrically nonlinear theories of elastic anisotropic plates
one often assumes that the strains and rotations about the normal to the
midplane are infinitesimal, and retains the products and squares of the
derivatives of the transverse deflection in the strain-displacement
equations (the von Karman assumption). The von Karman nonlinear theory
does not account for moderate rotation terms that could be of
significance in the analysis (especially in stability problems) of plates
while accounting for the transverse normal and shear strains. The small
strain and moderate rotation concept was used in the classical theory of
plates and shells by Sanders [12], Koiter [13], Reissner [14] and
Pietraszkiewicz [15], and in first-order plate and shell theories by
Naghdi and Vongsarnpigoon [16], and Librescu and Schmidt [17].

In the present study, the third-order shear deformation theory with moderate
rotation terms (see Reddy [18]) is reviewed, and analytical solutions of
the linear theory (see Khdeir, Reddy and Librescu [19]) are discussed.

II.  A  THIRD-ORDER  THEORY.  Consider  a  laminated  plate  composed  of 
N  orthotropic  layers,  symmetrically  located  with  respect  to  the  midplane 
of  the  laminate.  The  governing  equations  of  the  refined  theory  are 
based  on  the  following  displacement  field  [9-11]: 

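$$u_1 = u + z\left[\psi_x - \frac{4}{3}\left(\frac{z}{h}\right)^2\left(\psi_x + \frac{\partial w}{\partial x}\right)\right]$$

$$u_2 = v + z\left[\psi_y - \frac{4}{3}\left(\frac{z}{h}\right)^2\left(\psi_y + \frac{\partial w}{\partial y}\right)\right]$$

$$u_3 = w, \qquad (1)$$

where $(u_1, u_2, u_3)$ are the displacements along the x, y and z coordinates
respectively, $(u,v,w)$ are the corresponding displacements of a point on
the midplane of the laminate, and $\psi_x$ and $\psi_y$ are the rotations of a
transverse normal about the y- and x-axes, respectively.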
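A short numerical sketch can make the role of the cubic term concrete. Assuming the displacement field in the form u1 = u + z[ψx − (4/3)(z/h)²(ψx + ∂w/∂x)], the transverse shear strain ∂u1/∂z + ∂w/∂x is parabolic in z and vanishes on the faces z = ±h/2. The function names and the sample values below are hypothetical and serve only to illustrate this property.

```python
def inplane_displacement(u, psi_x, dw_dx, z, h):
    """u1 at height z for the assumed third-order displacement field."""
    return u + z * (psi_x - (4.0 / 3.0) * (z / h) ** 2 * (psi_x + dw_dx))


def transverse_shear_strain(psi_x, dw_dx, z, h):
    """d(u1)/dz + dw/dx, which is parabolic through the thickness."""
    return (psi_x + dw_dx) * (1.0 - 4.0 * z * z / (h * h))


def shear_strain_by_difference(u, psi_x, dw_dx, z, h, eps=1e-6):
    """Central-difference check of the closed-form shear strain."""
    up = inplane_displacement(u, psi_x, dw_dx, z + eps, h)
    um = inplane_displacement(u, psi_x, dw_dx, z - eps, h)
    return (up - um) / (2.0 * eps) + dw_dx


# Hypothetical sample values (not from the paper).
h, u, psi_x, dw_dx = 0.1, 0.0, 0.02, -0.005
top = transverse_shear_strain(psi_x, dw_dx, h / 2.0, h)
middle = transverse_shear_strain(psi_x, dw_dx, 0.0, h)
quarter = transverse_shear_strain(psi_x, dw_dx, h / 4.0, h)
quarter_fd = shear_strain_by_difference(u, psi_x, dw_dx, h / 4.0, h)
print(top, middle, quarter, quarter_fd)
```

The strain at the top face is zero up to rounding, while at the midplane it equals ψx + ∂w/∂x, so no shear correction factor is needed to satisfy the stress-free face conditions.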

The cubic variation of $u_1$ and $u_2$ through laminate thickness
introduces higher-order resultants

$$(R_1, R_2) = \int_{-h/2}^{h/2} z^2\,(\sigma_5, \sigma_4)\,dz$$

and laminate stiffnesses

$$(F_{ij}, H_{ij}) = \int_{-h/2}^{h/2} Q_{ij}\,(z^4, z^6)\,dz \quad (i,j = 1,2,6)$$

$$(D_{ij}, F_{ij}) = \int_{-h/2}^{h/2} Q_{ij}\,(z^2, z^4)\,dz \quad (i,j = 4,5).$$

For symmetrical cross-ply laminated plates, the following stiffness
coefficients vanish [9]:

$$B_{ij} = E_{ij} = 0 \quad \text{for } i,j = 1,2,4,5,6$$
$$A_{16} = A_{26} = D_{16} = D_{26} = F_{16} = F_{26} = H_{16} = H_{26} = 0$$
$$A_{45} = D_{45} = F_{45} = 0.$$

This implies that the effect of coupling between stretching and bending
vanishes. For such laminates the governing equations are given by (see
[11]):


~+Hn(-  A)(^4X+  A)  +  F 


3h2  1  11  ax3  "HV  3h2  ax3  ax4 


4  a\ 

. ,y  +  h  ( _ — i  ( _ x_ 

3x^3y  2bd  3x^3y 


+  -4%)  +  F 


2  2^  +  F12  2  +  ^12^-  2^  2  +  2  2^  +  F22  ^ 
3x^3y^  u  ay^sx  u  3h*  ay^ax  axV  3yJ 


4  3\  a4w  •  3\  3\  4  3\ 

+  H22(-  +  +  2F66(— 2^“  +  -T-)  +  2H66(-  “¥ (“T" 

^  3tr  ay4  3y^  bt)  3xSy  ay^ax  bb  3h*  ay^ax 

33*..  o,4,  .  2  3^„  „  3*w  X, 


3  III  _  4  „  2  34,  .  3\l>  2 

+ _ iL  +  _?L.y  )  1  _  4_  m  + _ -)  +  F  (-  —)( — -  +  3— ) 

9  9  ?!'  9  9  +  /  +  "RRV  91  Mv  9' 


9  r  ?  ?/ 1  2  1  55'  2  sx 

ax^ay  ax^ay*  h*  33  ax*  3X 


55'  h2/v3x 


2  3il»  .  3ib  2  3Ui  2 

+  T^)  +  Fd/i(-  “)( tJ-  +  ^t)1  +  ^55(73^  + 


44'3y2  3y  1  44'  h2May 



4  3*x  a2w  3*v  a2*  4  3<\, 

*  °55(-  7)(ir  +  +  A44(w*  +  *  °44<- 


+  ^"j)i  +  q  ■  0, 

ay 

2  ?  ?  9 

3  V  »  *„  4  3  A  /I  *  *u 

°11  ~F  +  °12  «5y  +  Fll{-  -^X-r  +  M>  +  f12(- 

li  ax4  14  axay  Ii  3h4  8X4  3x4  IZ  3hz  3X3y 

,3W  3\  3\  4  a2**  a2*.  ?a3w 

-».  .  n  / _ *  * _ J£\  *  p  / _ z_,  t _ *  j. _ JL  ^  law 


,3W  3\  4  a2<»  a2<»  o,3 

+ - 2^  +  °66^ — F"  +  Txiy)  +  F66^ - ?X — y  +  ^7^  +  — 

3X3y^  00  ay^  3X3y  00  3IT  ay‘  3x3y  3X3y2 

-  IW  ♦  S>  ♦***(- 3>(*x  A  IFuli1 

4  3\  a3u  3\  4  ®Z*v  a3w 

+  Hll(-  A)(— r  +  +  F12  777^  +  M*  -T)(«a5  +  -2Jt) 

11  3tr  ax^  ax3  “  3x3y  12  3h2  3x3y  3xay2 

2  2  2  2 
3\  9  *x  4  3  *x  3  >v  2a3w 

*  F«(a»y  +  IT-’  +  H«e<-  i?»7T +  iitf  *^7>i 

+  7  |D55(H  +  *x>  +  F55<-  7)(*x  *  I?'1  ’  °-  <» 

2  2  2  2  2 
3  *x  3  *v  4  3  *x  3  *v  a3w  3  "'y 

^  +  3*2  ’  +  F 66<_  7,(W  +  „2  +  2  ^  +  °«  *W 


7 

3~4>  4  3”di  3  -  3  di  3 

+  D 22  -/  +  Fl2(-  “?><ix5y  +  “V^  +  F22^-  -^>(-2*  +  ^ 
ay*  3h*  3X3y  ax*ay  **  3tr  ay*  ay3 

2  2 


-  'Vy  *  I?)  *  °44<-  7>(*,  +  I?'1  -  ^2  'F66<^  * 

4  s2*»  »2*v  ?a3u  *2»,  4  s2», 

*  H66<-  ^>^7  +  „2  +  -JZ>  *  F12  HTy  +  H12<-  jfllVSS 


4  3‘  *x  3  Vv  2a3w  3  ^x 

+  H66^ - 2X3x37  +  — +  — 2 — ^  +  F12  axTv 

&b  3hz  axay  3xz  3xz3y  12  axay 


+  ^T->  +  F22  *  H22(-  AK^  +  ^>1  *  h  1D44<I? 

axcay  CL  ay^  3h  ay  ay3  3y 

♦  V  *  F44<-  7><ly  *  V1  =0- 




Here $q$ denotes distributed transverse load, and $A_{ij}$, $D_{ij}$, $F_{ij}$, $H_{ij}$ are
the plate stiffnesses defined by

$$(D_{ij}, F_{ij}, H_{ij}) = \int_{-h/2}^{h/2} Q_{ij}^{(k)}\,(z^2, z^4, z^6)\,dz \quad (i,j = 1,2,6)$$

$$(A_{ij}, D_{ij}, F_{ij}) = \int_{-h/2}^{h/2} Q_{ij}^{(k)}\,(1, z^2, z^4)\,dz \quad (i,j = 4,5)$$

where $Q_{ij}^{(k)}$ denote the plane-stress reduced orthotropic moduli of the
$k$-th lamina. The boundary conditions of the refined theory are of the
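The stress resultants appearing in Eq. (5) can be expressed in terms of
the generalized displacements $(w, \psi_x, \psi_y)$ as:

$$M_1 = D_{11}\frac{\partial \psi_x}{\partial x} + D_{12}\frac{\partial \psi_y}{\partial y} + F_{11}\left(-\frac{4}{3h^2}\right)\left(\frac{\partial \psi_x}{\partial x} + \frac{\partial^2 w}{\partial x^2}\right) + F_{12}\left(-\frac{4}{3h^2}\right)\left(\frac{\partial \psi_y}{\partial y} + \frac{\partial^2 w}{\partial y^2}\right)$$

$$M_2 = D_{12}\frac{\partial \psi_x}{\partial x} + D_{22}\frac{\partial \psi_y}{\partial y} + F_{12}\left(-\frac{4}{3h^2}\right)\left(\frac{\partial \psi_x}{\partial x} + \frac{\partial^2 w}{\partial x^2}\right) + F_{22}\left(-\frac{4}{3h^2}\right)\left(\frac{\partial \psi_y}{\partial y} + \frac{\partial^2 w}{\partial y^2}\right)$$

$$M_6 = D_{66}\left(\frac{\partial \psi_x}{\partial y} + \frac{\partial \psi_y}{\partial x}\right) + F_{66}\left(-\frac{4}{3h^2}\right)\left(\frac{\partial \psi_x}{\partial y} + \frac{\partial \psi_y}{\partial x} + 2\frac{\partial^2 w}{\partial x\,\partial y}\right).$$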
The Levy method can be used to solve Eqs. (2) for rectangular plates
for which two opposite edges are simply supported. The other two edges
can each have arbitrary boundary conditions. Here we assume that the
edges parallel to the y-axis are simply supported, and the origin of the
coordinate system is taken as shown in Fig. 1. The simply supported
boundary conditions can be satisfied by trigonometric functions in x.
The resulting ordinary differential equations in y can be solved using
the state-space concept (see [20]).

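Following the Levy type procedure, we assume the following
representation of the displacements and loading:

$$w(x,y) = \sum_{m=1}^{\infty} W_m(y) \sin \alpha x$$

$$\psi_x(x,y) = \sum_{m=1}^{\infty} X_m(y) \cos \alpha x$$

$$\psi_y(x,y) = \sum_{m=1}^{\infty} Y_m(y) \sin \alpha x$$

$$q(x,y) = \sum_{m=1}^{\infty} Q_m(y) \sin \alpha x, \qquad (7)$$

where $\alpha = m\pi/a$ and $W_m$, $X_m$, $Y_m$ and $Q_m$ denote amplitudes of $w$, $\psi_x$, $\psi_y$
and $q$, respectively. Substituting Eqs. (7) into Eqs. (2), we obtain

$$e_1 W_m'''' + e_2 W_m'' + e_3 W_m + e_4 X_m'' + e_5 X_m + e_6 Y_m''' + e_7 Y_m' + Q_m = 0$$

$$e_8 W_m'' + e_9 W_m + e_{10} X_m'' + e_{11} X_m + e_{12} Y_m' = 0$$

$$e_{13} W_m''' + e_{14} W_m' + e_{15} X_m' + e_{16} Y_m'' + e_{17} Y_m = 0, \qquad (8)$$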

where  primes  on  the  variables  indicate  differentiation  with  respect  to  y, 
and 

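$$e_1 = -\left(\frac{4}{3h^2}\right)^2 H_{22}$$

$$e_2 = 2\left(\frac{4}{3h^2}\right)^2 \alpha^2 \left(H_{12} + 2H_{66}\right) + A_{44} - \frac{8}{h^2} D_{44} + \left(\frac{4}{h^2}\right)^2 F_{44}$$

$$e_{16} = D_{22} - \frac{8}{3h^2} F_{22} + \left(\frac{4}{3h^2}\right)^2 H_{22}$$

$$e_{17} = \alpha^2\left[-D_{66} + \frac{8}{3h^2} F_{66} - \left(\frac{4}{3h^2}\right)^2 H_{66}\right] + \frac{8}{h^2} D_{44} - \left(\frac{4}{h^2}\right)^2 F_{44} - A_{44}.$$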

Equations (8) can be written as

$$W_m'''' = c_1 W_m'' + c_2 W_m + c_3 X_m + c_4 Y_m' + c_0 Q_m$$

$$X_m'' = c_5 W_m'' + c_6 W_m + c_7 X_m + c_8 Y_m'$$

$$Y_m'' = c_9 W_m''' + c_{10} W_m' + c_{11} X_m' + c_{12} Y_m, \qquad (10)$$


where 


,e4  .  e4e6e12  e6e7 


a 


The  linear  system  of  ordinary  differential  equations  (10)  with 
constant  coefficients  can  be  reduced  to  a  single  matrix  differential 
equation  using  the  state-space  concept  (see  [20]) 

$$\mathbf{x}' = A\mathbf{x} + \mathbf{b}. \qquad (11)$$

This  can  be  done  by  introducing  the  variables 


•  vj 

4 

0*1 
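Once the equations are in the first-order form x' = Ax + b with constant A and b, the solution at any y follows from a matrix exponential. The sketch below is a generic illustration of that state-space idea, not the authors' implementation: it uses the standard trick of augmenting the state with a constant 1 so that a single exponential handles the inhomogeneous term, and a plain scaled Taylor series in place of a library routine. The small test systems are assumptions chosen because their exact solutions are known.

```python
import math


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def mat_exp(M, squarings=10, terms=20):
    """exp(M) by scaling and squaring with a truncated Taylor series."""
    n = len(M)
    scale = 2.0 ** squarings
    S = [[M[i][j] / scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def solve_state_space(A, b, x0, y):
    """Solution of x' = A x + b, x(0) = x0, evaluated at y."""
    n = len(A)
    # Augmented system: d/dy [x; 1] = [[A, b], [0, 0]] [x; 1].
    M = [[A[i][j] * y for j in range(n)] + [b[i] * y] for i in range(n)]
    M.append([0.0] * (n + 1))
    E = mat_exp(M)
    return [sum(E[i][j] * x0[j] for j in range(n)) + E[i][n] for i in range(n)]


# Harmonic oscillator written as a first-order system (exact: cos y, -sin y).
osc = solve_state_space([[0.0, 1.0], [-1.0, 0.0]], [0.0, 0.0], [1.0, 0.0], math.pi / 2)
# Scalar relaxation x' = -x + 2 with x(0) = 0 (exact: 2(1 - exp(-y))).
relax = solve_state_space([[-1.0]], [2.0], [0.0], 1.0)
print(osc, relax)
```

In an actual plate calculation the unknown initial state would be fixed by the boundary conditions at the two edges in y; that step is not shown here.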


Numerical results are presented for symmetric cross-ply (0°/90°/0°)
plates subjected to uniformly distributed load ($q_0$), as shown in Fig. 1.
The following material properties are used in the calculations:

$$E_1 = 19.2 \times 10^6 \text{ psi}, \quad E_2 = 1.56 \times 10^6 \text{ psi}$$

$$G_{12} = G_{13} = 0.82 \times 10^6 \text{ psi}, \quad G_{23} = 0.523 \times 10^6 \text{ psi} \qquad (19)$$

$$\nu_{12} = 0.24$$

The following notation has been used throughout the figures:

SS - simply supported at y = -b/2 and at y = b/2.

CC - clamped at y = -b/2 and at y = b/2.

FF - free at y = -b/2 and at y = b/2.

SC - simply supported at y = -b/2 and clamped at y = b/2.

SF - simply supported at y = -b/2 and free at y = b/2.

CF - clamped at y = -b/2 and free at y = b/2.  (20)

The aspect ratio, a/b, is taken to be 4.
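For a uniformly distributed load the load amplitudes in the Levy expansion do not depend on y. Assuming the sine expansion of q in x on 0 ≤ x ≤ a with α = mπ/a, the standard Fourier sine coefficients of a constant q0 are 4q0/(mπ) for odd m and zero for even m. The sketch below evaluates these amplitudes and sums the truncated series; the function names and the values of a and q0 are illustrative assumptions.

```python
import math


def load_amplitude(m, q0):
    """Sine-series amplitude of a uniform load q0 for harmonic m."""
    return 4.0 * q0 / (m * math.pi) if m % 2 == 1 else 0.0


def load_series(x, a, q0, m_max):
    """Truncated sine series for the uniform load on 0 <= x <= a."""
    return sum(
        load_amplitude(m, q0) * math.sin(m * math.pi * x / a)
        for m in range(1, m_max + 1)
    )


a, q0 = 4.0, 1.0  # hypothetical plate length and load intensity
center_value = load_series(a / 2.0, a, q0, 2001)
print(center_value)
```

The series converges slowly at a point (the error decays like 1/m), but the deflection amplitudes decay much faster, which is why a modest number of harmonics is usually adequate for displacements.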

To show the effect of transverse shear strains on the deflections,
plots of nondimensionalized center deflection, $\bar{w} = 10^2\, w(a/2,0)\, h^3 E_2/(q_0 a^4)$,
versus side-to-thickness ratio of various plates are presented in Figs.
2-4. The shear deformation effect is more significant in cross-ply
plates than in orthotropic plates. Also, the first-order shear
deformation theory (FSDT) overpredicts deflections relative to the
higher-order theory (HSDT).

Figures 5 and 6 contain plots of the transverse shear stress
through laminate thickness for various boundary conditions. The
stresses were computed using lamina constitutive relations. The
transverse shear stresses are constant and parabolic through thickness
of each lamina, respectively, for the first- and higher-order
theories. The discontinuity at the interface of laminae is due to the
mismatch of the material properties. When the stresses $(\sigma_x, \sigma_y, \sigma_{xy})$
obtained from the constitutive equations are substituted into the
equilibrium equations of elasticity and integrated through thickness to
determine the transverse shear stresses, the resulting stresses will be
continuous through the thickness.

III. A MODERATE-ROTATION THEORY. The theory is a generalization
of the classical plate theory, the first-order shear deformation plate
theory, and the third-order shear deformation theories of Reddy [9-11].
The theory is based on an assumed displacement field and orders of
magnitudes of linear strains and rotations.

Points of a three dimensional continuum $V$ are denoted by their
orthogonal curvilinear coordinates $x = (x^1, x^2, x^3)$. Covariant and
contravariant base vectors at points of the continuum are denoted
by $\mathbf{g}_i$ and $\mathbf{g}^i$, respectively. Latin indices are assumed to have values 1,
2, 3, and the Greek indices have values 1, 2. The laminated plate
continuum in the undeformed configuration is defined by the Cartesian
product of points in the midplane $\Omega$ and the normal $[-h/2, h/2]$:

$$V = \Omega \times \left[-\frac{h}{2}, \frac{h}{2}\right]$$

where $h$ denotes the constant thickness of the laminate. Let $x^\alpha$ denote
the curvilinear inplane coordinates and $x^3$ be the normal to $\Omega$. The
metric tensor components of $\Omega$ are denoted by

$$g_{\alpha\beta} = \mathbf{g}_\alpha \cdot \mathbf{g}_\beta, \qquad g_{\alpha 3} = \mathbf{g}_\alpha \cdot \mathbf{g}_3 = 0, \qquad g_{33} = \mathbf{g}_3 \cdot \mathbf{g}_3 = 1$$

$$\mathbf{g}_\alpha = \frac{\partial \mathbf{r}}{\partial x^\alpha}, \qquad \mathbf{g}_\alpha \cdot \mathbf{g}^\beta = \delta_\alpha^\beta, \qquad \mathbf{g}_3 = \mathbf{n},$$

where $\mathbf{r}$ is the position vector of a particle $(x^\alpha, x^3)$ at time $t$, $\delta_\alpha^\beta$ is
the Kronecker delta, and $\mathbf{n}$ is the unit normal to the boundary of $\Omega$.

The  displacement  vector  of  a  point  in  the  plate  at  time  t  is  of  the 


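where the Einstein summation convention on repeated subscripts is
assumed. The covariant components of the Green-Lagrange strain tensor
are given by

$$\varepsilon_{ij} = \frac{1}{2}\left(u_{i|j} + u_{j|i} + u_{m|i}\,u^{m}{}_{|j}\right)$$

where a vertical line denotes covariant differentiation. The strain
components $\varepsilon_{ij}$ can be expressed in terms of the linearized strains $e_{ij}$
and rotations $\omega_{ij}$ as

$$\varepsilon_{ij} = e_{ij} + \frac{1}{2}\,e_{mi}\,e^{m}{}_{j} + \frac{1}{2}\left(e_{mi}\,\omega^{m}{}_{j} + e_{mj}\,\omega^{m}{}_{i}\right) + \frac{1}{2}\,\omega_{mi}\,\omega^{m}{}_{j} \qquad (24)$$

where

$$e_{ij} = \frac{1}{2}\left(u_{i|j} + u_{j|i}\right), \qquad \omega_{ij} = \frac{1}{2}\left(u_{i|j} - u_{j|i}\right).$$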
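The decomposition of the Green-Lagrange strain into linearized strains and rotations is an algebraic identity, which can be confirmed numerically. The sketch below does this in Cartesian coordinates, where covariant derivatives reduce to partial derivatives and index position does not matter; it builds a random displacement gradient, splits it into symmetric and skew parts, and compares the two expressions for the strain. The function names and the random sample are assumptions for illustration.

```python
import random


def green_lagrange(H):
    """E_ij = (H_ij + H_ji + H_mi H_mj) / 2 for a 3x3 gradient H_ij = u_i,j."""
    return [
        [
            0.5 * (H[i][j] + H[j][i] + sum(H[m][i] * H[m][j] for m in range(3)))
            for j in range(3)
        ]
        for i in range(3)
    ]


def decomposed_strain(H):
    """Same strain, assembled from linear strains e and rotations w."""
    e = [[0.5 * (H[i][j] + H[j][i]) for j in range(3)] for i in range(3)]
    w = [[0.5 * (H[i][j] - H[j][i]) for j in range(3)] for i in range(3)]
    eps = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            ee = sum(e[m][i] * e[m][j] for m in range(3))
            ew = sum(e[m][i] * w[m][j] + e[m][j] * w[m][i] for m in range(3))
            ww = sum(w[m][i] * w[m][j] for m in range(3))
            eps[i][j] = e[i][j] + 0.5 * ee + 0.5 * ew + 0.5 * ww
    return eps


def max_difference(seed=0):
    """Largest entrywise gap between the two expressions for random H."""
    rng = random.Random(seed)
    H = [[rng.uniform(-0.3, 0.3) for _ in range(3)] for _ in range(3)]
    E1 = green_lagrange(H)
    E2 = decomposed_strain(H)
    return max(abs(E1[i][j] - E2[i][j]) for i in range(3) for j in range(3))


print(max_difference())
```

Because the identity is exact, the printed difference should be at the level of floating-point rounding; the approximations of the theory enter only when terms are discarded by order of magnitude.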


Following [17], we now assume that the strains and
rotations are of the following magnitude:

$$e_{ij} = O(\theta^2), \qquad \omega_{\alpha\beta} = O(\theta^2), \qquad \omega_{\alpha 3} = O(\theta), \qquad \theta \ll 1. \qquad (26)$$

Equation (26) implies that the strains and the rotations about the
normal to the midplane are small, and that the rotations of a normal to
the midplane are moderate. Such assumptions are justified in view of
the large inplane rigidity and transverse flexibility of composite
laminates.


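Neglecting terms of order $\theta^4$ and higher in the strain
displacement equations (24), we obtain

$$\varepsilon_{\alpha\beta} = e_{\alpha\beta} + \underline{\tfrac{1}{2}\left(e_{\alpha 3}\,\omega^{3}{}_{\beta} + e_{\beta 3}\,\omega^{3}{}_{\alpha}\right)} + \tfrac{1}{2}\,\omega_{3\alpha}\,\omega^{3}{}_{\beta}$$

$$\varepsilon_{\alpha 3} = e_{\alpha 3} + \underline{\tfrac{1}{2}\left(e_{\lambda\alpha}\,\omega^{\lambda}{}_{3} + e_{33}\,\omega^{3}{}_{\alpha}\right)} + \underline{\tfrac{1}{2}\,\omega_{\lambda\alpha}\,\omega^{\lambda}{}_{3}}$$

$$\varepsilon_{33} = e_{33} + \underline{e_{\lambda 3}\,\omega^{\lambda}{}_{3}} + \tfrac{1}{2}\,\omega_{\lambda 3}\,\omega^{\lambda}{}_{3}$$

where the underlined terms are of order $\theta^3$.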

The present theory is based on the following assumed variation of
the displacement components across the plate thickness:

$$u_\alpha(x^\beta, x^3, t) = u^0_\alpha(x^\beta, t) - x^3\,u^0_{3|\alpha} + f(x^3)\,u^1_\alpha(x^\beta, t)$$

$$u_3(x^\beta, x^3, t) = u^0_3(x^\beta, t) + u^1_3(x^\beta, t), \qquad (28)$$

where $f$ is a specified function of the thickness coordinate $x^3$. Note
that the transverse deflection is assumed to be independent of $x^3$ and
consists of two parts, one due to bending and the other due to
transverse shear. The particular form of displacement field is assumed
in order to include the displacement fields of the classical plate
theory (set $u^1_3 = 0$ and $u^1_\alpha = 0$), the first-order shear deformation theory
[set $u^1_3 = 0$ and $f(x^3) = x^3$], and the third-order shear deformation theory
of Reddy [9] [set $u^1_3 = 0$ and $f(x^3) = x^3\left[1 - \frac{4}{3}\left(\frac{x^3}{h}\right)^2\right]$], among others.

For the displacement field in Eq. (28), the strains for the
moderate rotation theory become [consistent with the assumptions in Eq.
(26)],


=  e°  +
33  =  e33
where $g = df/dx^3$, and
(30)
The dynamic version of the principle of virtual displacements is
used to derive variationally consistent equations of motion associated
with the displacement field in Eq. (28). The principle can be stated,
in the absence of body forces and prescribed tractions, as

$$0 = \int_0^T \left[ \int_V \sigma^{ij}\,\delta\varepsilon_{ij}\,dV + \int_\Omega q\,\delta u_3\,dA - \int_V \rho\,\dot{u}_i\,\delta\dot{u}^i\,dV \right] dt \qquad (31)$$
where $\sigma^{ij}$ denote the contravariant components of the symmetric stress
tensor, $q = q(x^\alpha)$ is the distributed transverse force per unit area,
and $\rho$ is the density of the material of the plate. The superposed dot
denotes the time derivative, $\dot{u} = \partial u/\partial t$. We introduce the couples and
inertias,
$$(N^{\alpha\beta}, M^{\alpha\beta}, P^{\alpha\beta}) = \int_{-h/2}^{h/2} \sigma^{\alpha\beta}\,(1, x^3, f)\,dx^3$$

$$(Q^\alpha, \hat{Q}^\alpha, R^\alpha, \hat{R}^\alpha, S^\alpha, \hat{S}^\alpha) = \int_{-h/2}^{h/2} \sigma^{\alpha 3}\,(1, g, x^3, x^3 g, f, f g)\,dx^3$$

$$(N^{33}, \hat{N}^{33}, \tilde{N}^{33}) = \int_{-h/2}^{h/2} \sigma^{33}\,(1, g, g^2)\,dx^3 \qquad (32)$$

$$I_0 = \int_{-h/2}^{h/2} \rho\,dx^3, \qquad I_1 = \int_{-h/2}^{h/2} \rho\,x^3\,dx^3, \qquad \bar{I}_1 = \int_{-h/2}^{h/2} \rho\,f\,dx^3$$

$$I_2 = \int_{-h/2}^{h/2} \rho\,(x^3)^2\,dx^3, \qquad \bar{I}_2 = \int_{-h/2}^{h/2} \rho\,x^3 f\,dx^3, \qquad \hat{I}_2 = \int_{-h/2}^{h/2} \rho\,f^2\,dx^3. \qquad (33)$$
The equations of motion of the theory are obtained by substituting Eq.
(30) for the strains in terms of the displacements $(u^0_\alpha, u^0_3, u^1_3, u^1_\alpha)$
into Eq. (31), integrating by parts to transfer differentiation from the
displacements to the stress resultants and couples, collecting the
coefficients of the various virtual displacements, and invoking the
fundamental lemma of the calculus of variations. We obtain the
following six equations:
where the underlined terms are entirely due to the inclusion of moderate
rotations (i.e., over and above the von Karman nonlinear terms).


Equations (34) can be specialized to the three different theories
discussed earlier. The equations are summarized below:

(i) Classical Plate Theory ($u^1_3 = 0$, $u^1_\alpha = 0$)
(ii) First-Order Shear Deformation Plate Theory ($u^1_3 = 0$, $f = x^3$)


Na8|g  +  (Q V 


!)L  3  I_u°  +  I.u1 

a7  8  0  a  la 


Qal«  +  i  -  V3 


MaS|  -  q6(J  *  u°|.)  -  H3u'  +  Rfl|  u1  •  I.u0  +  I.u1. 
'8  '  a8  a  1 8 7  _ a  '8  a  la  2  a 


(iii) Third-Order Shear Deformation Plate Theory ($u^1_3 = 0$, $f = x^3\left[1 - \frac{4}{3}(x^3/h)^2\right]$)
Note that several other theories can be obtained from Eq. (35) as
special cases. Analytical solutions to the linear version of the
third-order theory were presented in Section II.
transverse shear stress through the thickness of cross-ply


NUMERICAL  SOLUTION  OF  PARABOLIC  PROBLEMS 


IN  HIGH  DIMENSIONS 


Edward  Dean,  Roland  Glowinski,  Chin-Hsien  Li 
University  of  Houston 
Department  of  Mathematics 
4800  Calhoun  Road 
Houston,  TX  77006 


ABSTRACT.  The  main  goal  of  this  paper  is  to  discuss  the  numerical  solution  of 
mathematical  problems  of  parabolic  type  when  the  space  dimension  is  high  and/or 
the  number  of  discretization  points  is  quite  large.  In  such  cases,  we  can  take 
advantage  of  the  evolution  nature  of  the  problem  under  consideration  to  derive 
numerical  methods  quite  easy  to  implement  and  well  suited  to  vector  and/or  parallel 
computers.  Operator  splitting  methods  are  one  of  the  key  ingredients  of  such  a 
methodology.  We  shall  illustrate  the  methods  described  in  this  paper  by  solving 
the  time  dependent  Navier-Stokes  equations  for  incompressible  viscous  fluids,  a 
variational  problem  originating  from  the  physics  of  liquid  crystals,  and  finally 
advection-diffusion  problems  in  very  high  dimension  associated  to  the  solution  of 
the  Zakai  equation  in  stochastic  optimal  control. 

1.  GENERALITIES  AND  SYNOPSIS. 

Linear and nonlinear Parabolic Problems for Partial Differential
Operators occur in many branches of Natural and Engineering Sciences. One of
the main goals of this paper is to discuss the numerical solution of such problems
by methods taking advantage of the evolution nature of the problem under
consideration, and also well-suited to vector and/or parallel computers.

Operator splitting methods are definitely a key to the solution of such
problems and a description of these methods will be given in Section 2. In Section 3,
we shall consider the solution of the Navier-Stokes equations for unsteady
incompressible viscous flows, then in Section 4 the solution of nonconvex
variational problems originating from the physics of liquid crystals. Finally in
Section 5, we shall consider time dependent advection-diffusion problems
whose solution is an important part of some solution methods for those complicated
Zakai equations originating from Stochastic Optimal Control; we shall discuss
there various solution methods using first and second order upwinding and also
the modified method of characteristics when the diffusion coefficients are
small.

The  techniques  described  in  Sections  3,  4,  5  will  be  illustrated  by  numerical 
experiments. 

2.  DESCRIPTION OF SOME BASIC OPERATOR SPLITTING METHODS FOR TIME DEPENDENT PROBLEMS.

2.1.  GENERALITIES. 

Let $V$ be a Banach space; we consider in $V$ the following initial
value problem

$$\frac{du}{dt} + A(u) = 0, \tag{2.1}$$

$$u(0) = u_0, \tag{2.2}$$

where, in (2.1), $A(\cdot)$ is a linear or nonlinear operator.

We suppose that $A$ has the following nontrivial decomposition property

$$A = A_1 + A_2 \tag{2.3}$$

(by nontrivial we mean that $A_1$ and $A_2$ are "individually" simpler than $A$).

There  are  many  techniques  to  achieve  the  numerical  integration  of  the  initial 
value  problem  (2.1)  ,  (2.2)  by  taking  advantage  of  the  decomposition  property 
(2.3).  We  shall  describe  some  of  them  just  below  (more  techniques  are  described  in, 
e.g.,  [1]  ).  Before  giving  these  descriptions  let’s  introduce  some  helpful  notation. 

In the sequel $\Delta t$ ($> 0$) will be a time discretization step and $u^{n+\alpha}$ will
denote an approximation of $u((n+\alpha)\Delta t)$. The first scheme to be described is
the Peaceman-Rachford scheme (cf. Sec. 2.2), followed by what we call a $\theta$-scheme
(cf. Sec. 2.3).

2.2.  THE  PEACEMAN-RACHFORD  SCHEME. 

The  principle  of  that  scheme,  introduced  in  [2],  is  quite  simple: 

Consider the time interval $[n\Delta t, (n+1)\Delta t]$ and suppose that $u^n$ is known;
introducing the mid-point $(n+1/2)\Delta t$, we integrate (2.1) over $[n\Delta t, (n+1/2)\Delta t]$ by
a scheme which is of backward Euler type for $A_1$ (implicit in $A_1$) and of
forward Euler type for $A_2$ (explicit); on $[(n+1/2)\Delta t, (n+1)\Delta t]$ we exchange the
roles of $A_1$ and $A_2$. The above program is realized by the following scheme:

$$u^0 = u_0; \tag{2.4}$$

then for $n \ge 0$, $u^n$ being known, we compute $u^{n+1/2}$ and $u^{n+1}$ by solving
successively

$$\frac{u^{n+1/2} - u^n}{\Delta t/2} + A_1(u^{n+1/2}) + A_2(u^n) = 0, \tag{2.5}$$

$$\frac{u^{n+1} - u^{n+1/2}}{\Delta t/2} + A_1(u^{n+1/2}) + A_2(u^{n+1}) = 0. \tag{2.6}$$


We observe that, initialization excepted, $A_1$ and $A_2$ play a symmetric role in
the above scheme.

To study some of the basic properties of scheme (2.4) - (2.6), such as
accuracy and stability, we consider the particular case where

(i) $V = \mathbb{R}^N$,

(ii) $A$ is an $N \times N$ symmetric and positive definite matrix; $u_0 \in \mathbb{R}^N$.

In such a case, the exact solution of (2.1), (2.2) is known and is given by

$$u(t) = e^{-tA} u_0.$$

Concerning the decomposition of $A$ we decompose it as follows:

$$A = A_1 + A_2; \quad A_1 = \alpha A, \quad A_2 = \beta A, \quad \text{with } \alpha + \beta = 1, \; 0 < \alpha, \beta < 1. \tag{2.8}$$


Stability Properties of Scheme (2.4) - (2.6): We have from (2.5), (2.6), (2.8),

$$u^{n+1} = \left(I + \beta \tfrac{\Delta t}{2} A\right)^{-1} \left(I - \alpha \tfrac{\Delta t}{2} A\right) \left(I + \alpha \tfrac{\Delta t}{2} A\right)^{-1} \left(I - \beta \tfrac{\Delta t}{2} A\right) u^n. \tag{2.9}$$

Using a vector basis consisting of eigenvectors of $A$, we have from (2.9)

$$u_i^{n+1} = \frac{\left(1 - \alpha \tfrac{\Delta t}{2} \lambda_i\right)\left(1 - \beta \tfrac{\Delta t}{2} \lambda_i\right)}{\left(1 + \alpha \tfrac{\Delta t}{2} \lambda_i\right)\left(1 + \beta \tfrac{\Delta t}{2} \lambda_i\right)} \, u_i^n, \tag{2.10}$$

where $\lambda_i$ ($> 0$, $\forall\, i = 1, \ldots, N$) is the $i$th eigenvalue of $A$; we suppose
that $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_N$. Consider now the rational function $R_1$ defined by

$$R_1(x) = \frac{\left(1 - \tfrac{\alpha}{2} x\right)\left(1 - \tfrac{\beta}{2} x\right)}{\left(1 + \tfrac{\alpha}{2} x\right)\left(1 + \tfrac{\beta}{2} x\right)}; \tag{2.11}$$

we observe that $|R_1(x)| < 1$ for all $x > 0$, implying, in that simple case, the
unconditional stability of scheme (2.4) - (2.6). Since

$$\lim_{x \to +\infty} R_1(x) = 1, \tag{2.12}$$

we observe that for stiff problems, i.e. problems such that $\lambda_N/\lambda_1 \gg 1$,
scheme (2.4) - (2.6) is not very good at damping simultaneously the components of
$u^n$ associated with the large and with the small eigenvalues of $A$. From this
observation, we can expect that scheme (2.4) - (2.6) is not well suited to
"capture" the steady state solutions of stiff problems (like those obtained from the
discretization of partial differential equations); this has been confirmed by
numerical experiments.


Accuracy Properties of Scheme (2.4) - (2.6): Since

$$e^{-x} = 1 - x + \frac{x^2}{2} + x^2 \varepsilon(x) \tag{2.13}$$

and, from (2.11),

$$R_1(x) = 1 - x + \frac{x^2}{2} + x^2 \eta(x), \tag{2.14}$$

with $\lim_{x \to 0} \varepsilon(x) = \lim_{x \to 0} \eta(x) = 0$, we have that scheme (2.4) - (2.6) is
second order accurate in the simple case considered here. We observe from
(2.9) that if one takes $\alpha = \beta = 1/2$, then the two linear systems which have to
be solved at each full step are, in fact, associated with the same matrix $I + \Delta t A/4$.
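The stability and accuracy statements for the Peaceman-Rachford scheme can be checked numerically on the scalar model problem. The sketch below is only an illustration: the function names `r1` and `observed_order`, the sampling range, and the step used to estimate the order are our own choices, not part of the paper. It evaluates the amplification factor (2.11) and estimates the order $p$ from the local error $|R_1(x) - e^{-x}| \approx C x^{p+1}$.

```python
import numpy as np

def r1(x, alpha=0.5, beta=0.5):
    """Amplification factor (2.11) of the Peaceman-Rachford scheme."""
    x = np.asarray(x, dtype=float)
    num = (1.0 - 0.5 * alpha * x) * (1.0 - 0.5 * beta * x)
    den = (1.0 + 0.5 * alpha * x) * (1.0 + 0.5 * beta * x)
    return num / den

def observed_order(amplification, x=0.05):
    """Estimate p such that |R(x) - exp(-x)| behaves like C * x**(p + 1)."""
    e1 = abs(amplification(x) - np.exp(-x))
    e2 = abs(amplification(x / 2.0) - np.exp(-x / 2.0))
    return float(np.log2(e1 / e2) - 1.0)

# |R_1| stays below one on a wide sample of positive x (unconditional stability),
# the estimated order is close to two, and R_1 tends to one for large x,
# which is the weak damping of stiff components noted after (2.12).
samples = np.logspace(-3, 6, 200)
stable = bool(np.all(np.abs(r1(samples)) < 1.0))
order = observed_order(r1)
stiff_limit = float(r1(1.0e8))
```

The same functions accept other admissible splittings, for example `r1(x, 0.3, 0.7)`, as long as $\alpha + \beta = 1$.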


2.3. THE $\theta$-SCHEME.

In order to construct operator splitting methods better suited than scheme
(2.4) - (2.6) to the numerical integration of stiff initial value problems (2.1), (2.2),
we introduce first $\theta \in (0, 1/2)$ and then associate to $\theta$ the decomposition of the
interval $[n\Delta t, (n+1)\Delta t]$ given by

$$[n\Delta t, (n+1)\Delta t] = [n\Delta t, (n+\theta)\Delta t] \cup [(n+\theta)\Delta t, (n+1-\theta)\Delta t] \cup [(n+1-\theta)\Delta t, (n+1)\Delta t].$$

A numerical method for (2.1), (2.2) taking advantage of (2.3) and of the above
splitting of $[n\Delta t, (n+1)\Delta t]$ is defined as follows:

$$u^0 = u_0; \tag{2.15}$$

then for $n \ge 0$, $u^n$ being known, we compute $u^{n+\theta}$, $u^{n+1-\theta}$ and $u^{n+1}$
by solving successively

$$\frac{u^{n+\theta} - u^n}{\theta \Delta t} + A_1(u^{n+\theta}) + A_2(u^n) = 0, \tag{2.16}$$

$$\frac{u^{n+1-\theta} - u^{n+\theta}}{(1-2\theta)\Delta t} + A_1(u^{n+\theta}) + A_2(u^{n+1-\theta}) = 0, \tag{2.17}$$

$$\frac{u^{n+1} - u^{n+1-\theta}}{\theta \Delta t} + A_1(u^{n+1}) + A_2(u^{n+1-\theta}) = 0. \tag{2.18}$$


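Stability and Accuracy Properties of Scheme (2.15) - (2.18): taking the same model
problem as in Section 2.2, we have (with $\theta' = 1 - 2\theta$)

$$u^{n+1} = (I + \alpha\theta\Delta t A)^{-1} (I - \beta\theta\Delta t A) (I + \beta\theta'\Delta t A)^{-1} (I - \alpha\theta'\Delta t A) (I + \alpha\theta\Delta t A)^{-1} (I - \beta\theta\Delta t A) \, u^n, \tag{2.19}$$

which implies

$$u_i^{n+1} = \frac{(1 - \beta\theta\Delta t\lambda_i)^2 (1 - \alpha\theta'\Delta t\lambda_i)}{(1 + \alpha\theta\Delta t\lambda_i)^2 (1 + \beta\theta'\Delta t\lambda_i)} \, u_i^n. \tag{2.20}$$

Consider now the rational function $R_2$ defined by

$$R_2(x) = \frac{(1 - \beta\theta x)^2 (1 - \alpha\theta' x)}{(1 + \alpha\theta x)^2 (1 + \beta\theta' x)}; \tag{2.21}$$

since

$$\lim_{x \to +\infty} |R_2(x)| = \beta/\alpha, \tag{2.22}$$

we should prescribe

$$\alpha \ge \beta \tag{2.23}$$

to have, from (2.19), (2.20), the stability of scheme (2.15) - (2.18) for the large
eigenvalues of $A$. Concerning the accuracy of scheme (2.15) - (2.18) we can
show that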

$$R_2(x) = 1 - x + \frac{x^2}{2} \{1 + (\beta^2 - \alpha^2)(2\theta^2 - 4\theta + 1)\} + x^2 \eta(x), \tag{2.24}$$

with $\lim_{x \to 0} \eta(x) = 0$. It follows from (2.24) that scheme (2.15) - (2.18) is
second order accurate if either

$$\alpha = \beta \; (= 1/2 \text{ from } (2.8)), \tag{2.25}$$

or

$$\theta = 1 - 1/\sqrt{2} = .29289\ldots; \tag{2.26}$$

scheme (2.15) - (2.18) is only first order accurate if neither (2.25) nor
(2.26) holds. If one takes $\alpha = \beta = 1/2$, it follows from (2.20), (2.21) that
scheme (2.15) - (2.18) is unconditionally stable for all $\theta \in (0, 1/2)$; however,
since (from (2.22)) we have

$$\lim_{x \to +\infty} |R_2(x)| = 1, \tag{2.27}$$

the remark stated for scheme (2.4) - (2.6) concerning the integration of stiff
systems still holds. In general, we shall choose $\alpha$ and $\beta$ in order to have the
same matrix for all the partial steps of the integration procedure; i.e., $\alpha$, $\beta$, $\theta$
have to satisfy

$$\alpha\theta = \beta(1 - 2\theta), \tag{2.28}$$

which implies

$$\alpha = (1 - 2\theta)/(1 - \theta), \quad \beta = \theta/(1 - \theta). \tag{2.29}$$

Combining (2.23), (2.29) we obtain

$$0 < \theta \le 1/3. \tag{2.30}$$

For $\theta = 1/3$, (2.29) implies $\alpha = \beta = 1/2$; the resulting scheme is just a
variant of scheme (2.4) - (2.6).

If $0 < \theta < 1/3$, and if $\alpha$ and $\beta$ are given by (2.29), we have then

$$\lim_{x \to +\infty} |R_2(x)| = \beta/\alpha = \theta/(1 - 2\theta) < 1. \tag{2.31}$$

Actually, we can prove that $\theta \in [\theta^*, 1/3]$ (with $\theta^* = .087385580\ldots$) and $\alpha$,
$\beta$ given by (2.29) imply the unconditional stability of scheme (2.15) - (2.18).
Moreover, if $\theta \in (\theta^*, 1/3)$, property (2.31) implies that scheme (2.15) - (2.18) has
good asymptotic properties as $n \to +\infty$ and, for example, is well suited to compute
steady state solutions. If $\theta = 1 - 1/\sqrt{2}$ (resp. $\theta = 1/4$), we have $\alpha = 2 - \sqrt{2}$,
$\beta = \sqrt{2} - 1$, $\beta/\alpha = 1/\sqrt{2}$ (resp. $\alpha = 2/3$, $\beta = 1/3$, $\beta/\alpha = 1/2$).
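The properties of the $\theta$-scheme stated above lend themselves to a quick numerical check on the scalar model problem. The following sketch is illustrative only; the names `theta_coeffs`, `r2` and `observed_order` and the sample points are our own assumptions. It builds $\alpha$, $\beta$ from (2.29), evaluates the amplification factor (2.21), and compares the estimated order and the stiff limit with (2.24) - (2.26) and (2.31).

```python
import numpy as np

def theta_coeffs(theta):
    """alpha and beta from (2.29), so that all partial steps share one matrix."""
    return (1.0 - 2.0 * theta) / (1.0 - theta), theta / (1.0 - theta)

def r2(x, theta):
    """Amplification factor (2.21) with alpha, beta chosen by (2.29)."""
    alpha, beta = theta_coeffs(theta)
    tp = 1.0 - 2.0 * theta                      # theta' in the text
    x = np.asarray(x, dtype=float)
    num = (1.0 - beta * theta * x) ** 2 * (1.0 - alpha * tp * x)
    den = (1.0 + alpha * theta * x) ** 2 * (1.0 + beta * tp * x)
    return num / den

def observed_order(theta, x=0.05):
    """Estimate p such that |R_2(x) - exp(-x)| behaves like C * x**(p + 1)."""
    e1 = abs(r2(x, theta) - np.exp(-x))
    e2 = abs(r2(x / 2.0, theta) - np.exp(-x / 2.0))
    return float(np.log2(e1 / e2) - 1.0)

theta_opt = 1.0 - 1.0 / np.sqrt(2.0)            # the value singled out in (2.26)
alpha_opt, beta_opt = theta_coeffs(theta_opt)   # expected: 2 - sqrt(2), sqrt(2) - 1
order_opt = observed_order(theta_opt)           # second order expected
order_quarter = observed_order(0.25)            # first order expected
stiff_limit_quarter = float(abs(r2(1.0e9, 0.25)))   # expected to approach beta/alpha
```

Unlike the Peaceman-Rachford factor, $|R_2|$ tends to $\beta/\alpha < 1$ for large $x$, which is the damping of stiff components that motivates the choice (2.29).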

2.4.  Further  Comments  on  Operator  Splitting  Methods. 

Integration schemes related to (2.15) - (2.18) have been discussed in [3] (see
also [4] - [6]). Concerning the convergence of the above schemes, the convergence
of the Peaceman-Rachford scheme (2.4) - (2.6) has been proved in [7] (see also [8])
under quite general monotonicity assumptions on $A_1$ and $A_2$ (in fact these
operators can even be multivalued). There are no such general results at the
moment for scheme (2.15) - (2.18) (see however the discussion in [9]). In [10], one
can find splitting methods derived from the Lie-Trotter formula and applicable to
situations in which $A = A_1 + A_2 + A_3$; these methods however may be inaccurate
for steady state calculations; splitting methods for more than two operators
are also discussed in, e.g., [1], [11], [12].

To conclude Section 2, we would like to describe a variation of scheme (2.4) -
(2.6) (due to Douglas and Rachford; cf. [13]); on some occasions it seems to behave
better than (2.4) - (2.6) as a tool to capture steady state solutions of systems such
as (2.1), (2.2); however, as a method for the numerical integration of (2.1), (2.2)
it is only first order accurate. In addition to that, although more robust than
scheme (2.4) - (2.6), it also suffers from the basic drawback of not being well
suited to the numerical integration of stiff differential systems.

The Douglas-Rachford scheme is described by

$$u^0 = u_0; \tag{2.32}$$

then for $n \ge 0$, $u^n$ being known, we compute $\hat{u}^{n+1}$ and $u^{n+1}$ as the
solutions of

$$\frac{\hat{u}^{n+1} - u^n}{\Delta t} + A_1(\hat{u}^{n+1}) + A_2(u^n) = 0, \tag{2.33}$$

$$\frac{u^{n+1} - u^n}{\Delta t} + A_1(\hat{u}^{n+1}) + A_2(u^{n+1}) = 0. \tag{2.34}$$

The convergence of scheme (2.32) - (2.34) is proved in [7], [8] for $A_1$, $A_2$
monotone (possibly multivalued) operators.
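The first order accuracy and the poor damping of stiff components claimed for the Douglas-Rachford scheme can be seen on the model problem of Section 2.2. The short derivation below is a sketch under two assumptions of ours: the scheme is read in its usual form (an intermediate value $\hat{u}^{n+1}$ computed implicitly in $A_1$ and explicitly in $A_2$, then a correction step implicit in $A_2$), and the notation $R_{DR}$ and $x = \Delta t \lambda_i$ is introduced here for convenience.

```latex
% Model problem: A_1 = \alpha A, A_2 = \beta A, \alpha + \beta = 1, x = \Delta t\,\lambda_i > 0.
\begin{align*}
(1 + \alpha x)\,\hat{u}_i^{n+1} &= (1 - \beta x)\,u_i^n, \\
(1 + \beta x)\,u_i^{n+1} &= u_i^n - \alpha x\,\hat{u}_i^{n+1}
   = \frac{1 + \alpha\beta x^2}{1 + \alpha x}\,u_i^n, \\
R_{DR}(x) &= \frac{1 + \alpha\beta x^2}{(1 + \alpha x)(1 + \beta x)}
   = 1 - \frac{x}{1 + x + \alpha\beta x^2}
   = 1 - x + x^2 + O(x^3).
\end{align*}
% Compare with e^{-x} = 1 - x + x^2/2 + O(x^3): the x^2 terms differ, hence first order only.
% Moreover 0 < R_{DR}(x) < 1 for all x > 0 (unconditional stability), while
% \lim_{x \to +\infty} R_{DR}(x) = 1, so stiff components are hardly damped.
```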


3.  APPLICATION  TO  THE  NAVIER-STOKES  EQUATIONS  FOR  INCOMPRESSIBLE 
VISCOUS  FLUIDS. 

3.1.  GENERALITIES.  SYNOPSIS. 

In this section, we shall discuss the application of the operator splitting
methods described in Section 2 to the numerical simulation of incompressible
viscous flows modeled by the Navier-Stokes equations. We shall only give here
the general principle of such numerical treatment, referring for more details to
[14] - [18].

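Let us consider a Newtonian incompressible viscous fluid. If $\Omega$ and $\Gamma$ denote
the flow region ($\Omega \subset \mathbb{R}^N$, $N = 2, 3$ in practice) and its boundary, respectively, then
this flow is governed by the following Navier-Stokes equations

$$\frac{\partial u}{\partial t} - \nu \Delta u + (u \cdot \nabla) u + \nabla p = f \text{ in } \Omega, \tag{3.1}$$

$$\nabla \cdot u = 0 \text{ in } \Omega \quad \text{(incompressibility condition)}. \tag{3.2}$$

In (3.1), (3.2),

(a) $\nabla = \{\partial/\partial x_i\}_{i=1}^N$, $\Delta = \nabla^2 = \sum_{i=1}^N \partial^2/\partial x_i^2$, $x = \{x_i\}_{i=1}^N$ the generic point of $\mathbb{R}^N$,

(b) $u = \{u_i\}_{i=1}^N$ is the flow velocity,

(c) $p$ is the pressure,

(d) $\nu$ is a viscosity parameter,

(e) $f$ is a density of external forces.

In (3.1), $(u \cdot \nabla) u$ is a symbolic notation for the nonlinear vector term

$$\left\{ \sum_{j=1}^N u_j \frac{\partial u_i}{\partial x_j} \right\}_{i=1}^N.$$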

Boundary and initial conditions have to be added to (3.1), (3.2); here, we shall
only consider Dirichlet boundary conditions such as

$$u = g \text{ on } \Gamma, \tag{3.3}$$

with, from the incompressibility condition (3.2),

$$\int_\Gamma g \cdot n \, d\Gamma = 0, \tag{3.4}$$

with $n$ the outward unit vector normal to $\Gamma$.

Finally we shall prescribe as initial condition

$$u(x, 0) = u_0(x) \text{ a.e. on } \Omega, \text{ with } \nabla \cdot u_0 = 0. \tag{3.5}$$

Boundary conditions more complicated than (3.3) are discussed in, e.g., [14], [18].

The  Navier-Stokes  equations  for  incompressible  viscous  fluids  have  been 
motivating  a  very  large  number  of  papers,  books,  reports,  symposia,  workshops,  etc. 
Mentioning  all  of  them  is  impossible  and  we  therefore  refer  to  the  references  in 
[14]  -  [18]  . 

The difficulties with the Navier-Stokes equations (even for flows at low
Reynolds numbers, in bounded regions $\Omega$) are

(i) the nonlinear term $(u \cdot \nabla) u$ in (3.1),

(ii) the incompressibility condition (3.2),

(iii) the fact that their solutions are vector-valued functions of
$x$, $t$, whose components are coupled by the nonlinear term
$(u \cdot \nabla) u$ and by the incompressibility condition $\nabla \cdot u = 0$.

Using  the  operator  splitting  methods  of  Section  2  for  the  time  discretization  of 
the  Navier-Stokes  equations,  we  shall  be  able  to  decouple  those  difficulties 
associated  to  the  nonlinearity  and  the  incompressibility,  respectively. 


3.2.  Time  Discretization  by  Operator  Splitting  Methods. 

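We shall concentrate on the $\theta$-scheme since, from our numerical experiments, it
seems to be the one giving the best results. We have then

$$u^0 = u_0; \tag{3.6}$$

then for $n \ge 0$, starting from $u^n$ we solve

$$\begin{cases}
\dfrac{u^{n+\theta} - u^n}{\theta \Delta t} - \alpha\nu \Delta u^{n+\theta} + \nabla p^{n+\theta} = f^{n+\theta} + \beta\nu \Delta u^n - (u^n \cdot \nabla) u^n \text{ in } \Omega, \\
\nabla \cdot u^{n+\theta} = 0 \text{ in } \Omega, \\
u^{n+\theta} = g^{n+\theta} \text{ on } \Gamma,
\end{cases} \tag{3.7}$$

$$\begin{cases}
\dfrac{u^{n+1-\theta} - u^{n+\theta}}{(1 - 2\theta)\Delta t} - \beta\nu \Delta u^{n+1-\theta} + (u^{n+1-\theta} \cdot \nabla) u^{n+1-\theta} = f^{n+1-\theta} + \alpha\nu \Delta u^{n+\theta} - \nabla p^{n+\theta} \text{ in } \Omega, \\
u^{n+1-\theta} = g^{n+1-\theta} \text{ on } \Gamma,
\end{cases} \tag{3.8}$$

$$\begin{cases}
\dfrac{u^{n+1} - u^{n+1-\theta}}{\theta \Delta t} - \alpha\nu \Delta u^{n+1} + \nabla p^{n+1} = f^{n+1} + \beta\nu \Delta u^{n+1-\theta} - (u^{n+1-\theta} \cdot \nabla) u^{n+1-\theta} \text{ in } \Omega, \\
\nabla \cdot u^{n+1} = 0 \text{ in } \Omega, \\
u^{n+1} = g^{n+1} \text{ on } \Gamma.
\end{cases} \tag{3.9}$$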
3.3. Some Comments and Remarks Concerning Scheme (3.6) - (3.9)


Using the above operator splitting method, we have been able to decouple
nonlinearity and incompressibility in the Navier-Stokes equations (3.1), (3.2).
We shall comment in the following sections on the specific treatment of the
subproblems encountered at each step of algorithm (3.6) - (3.9). We shall only
consider the case where the subproblems are still continuous in space (since the
formalism of the continuous problems is much simpler); for the fully discrete case
see [14] (with $\theta = 1/4$) and [18] where finite element approximations of (3.1),
(3.2) are discussed.

We observe that $u^{n+\theta}$ and $u^{n+1}$ are obtained from the solution of linear
problems very close to the steady Stokes problem.

If one uses scheme (3.6) - (3.9), the best choice for $\alpha$ and $\beta$ is given by
(2.29). With such a choice, many computer subprograms can be used for both the
linear and nonlinear subproblems, resulting therefore in quite substantial core
memory savings.


3.4.  Solution  of  the  Nonlinear  Subproblem  (3.8) 

This is not the place to give a detailed discussion of solution methods for the
nonlinear subproblem (3.8); we should observe however that it belongs to the
following class of nonlinear Dirichlet systems

$$\begin{cases}
\alpha u - \nu \Delta u + (u \cdot \nabla) u = f \text{ in } \Omega, \\
u = g \text{ on } \Gamma,
\end{cases} \tag{3.10}$$

where $\alpha$ and $\nu$ are two positive parameters (with $\alpha \sim 1/\Delta t$, here) and
where $f$ and $g$ are two given functions defined on $\Omega$ and $\Gamma$, respectively.

Several solution methods for (3.10) are discussed in [14] - [18], including
Newton's method and nonlinear least squares conjugate gradient methods (see also
[19] for further details). In the case of the nonlinear least squares conjugate
gradient methods, we have been using algorithms preconditioned by discrete
variants of the elliptic operator

$$v \to \alpha v - \nu \Delta v \tag{3.11}$$

with homogeneous Dirichlet boundary conditions. In the case of flows at large
Reynolds numbers the viscosity parameter $\nu$ is usually small; moreover the fast
dynamics of these flows require a small $\Delta t$, implying that $\alpha$ is a large number.
From these facts, the discrete forms of the elliptic operator (3.11) are matrices
whose condition number is small, implying that simple solution methods such as
successive over-relaxation (S.O.R.) and nonpreconditioned conjugate gradient
methods will have a very fast convergence for solving the linear systems associated
with those matrices approximating operator (3.11) (relaxation methods are particularly
interesting since they have very good vectorization and parallelization properties);
indeed, acceleration methods such as multigrid or preconditioned conjugate gradient
are unnecessary for those specific problems. Similarly, the iterative solution of the
discrete variants of (3.10) by the nonlinear least squares conjugate gradient methods
described in [14] - [18] is quite fast and obtained in 3 to 4 iterations.
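To make the conditioning argument concrete, the sketch below compares the condition number of a discrete form of the operator (3.11) with that of the discrete Laplacian alone. It is only an illustration under assumptions of ours: a one-dimensional, three-point finite difference Laplacian with homogeneous Dirichlet conditions stands in for the finite element matrices of the paper, the helper names are hypothetical, and the values $\nu = 1/8000$, $\alpha = 100$ and 127 interior points are merely sample values of the same order as those used in the experiments of Section 3.6.

```python
import numpy as np

def helmholtz_matrix(n, alpha, nu):
    """Three-point finite difference matrix of v -> alpha*v - nu*v'' on (0, 1),
    with homogeneous Dirichlet conditions and n interior grid points."""
    h = 1.0 / (n + 1)
    laplacian = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return alpha * np.eye(n) + nu * laplacian

def condition_number(matrix):
    """Ratio of extreme eigenvalues of a symmetric positive definite matrix."""
    eigenvalues = np.linalg.eigvalsh(matrix)
    return float(eigenvalues[-1] / eigenvalues[0])

n = 127                       # mesh size h = 1/128
cond_shifted = condition_number(helmholtz_matrix(n, alpha=100.0, nu=1.0 / 8000.0))
cond_laplacian = condition_number(helmholtz_matrix(n, alpha=0.0, nu=1.0))
```

With a large $\alpha$ and a small $\nu$ the shifted matrix is very close to a multiple of the identity, which is why plain relaxation or unpreconditioned conjugate gradient iterations converge quickly, whereas the Laplacian alone has a condition number that grows like $h^{-2}$.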

3.5. Solution of the Stokes Linear Subproblems (3.7), (3.9)

At each full step of algorithm (3.6) - (3.9) we have to solve two linear
problems of the following type

$$\begin{cases}
\alpha u - \nu \Delta u + \nabla p = f \text{ in } \Omega, \\
\nabla \cdot u = 0 \text{ in } \Omega, \\
u = g \text{ on } \Gamma \quad \left(\text{with } \int_\Gamma g \cdot n \, d\Gamma = 0\right),
\end{cases} \tag{3.12}$$

where $\alpha$ and $\nu$ are two positive parameters, and where $f$ and $g$ are two
given functions defined on $\Omega$ and $\Gamma$, respectively. We recall that if $f$ and
$g$ are sufficiently smooth, then problem (3.12) has a unique solution in
$V_g \times (L^2(\Omega)/\mathbb{R})$, with

$$V_g = \{v \mid v \in (H^1(\Omega))^N, \; v = g \text{ on } \Gamma\}; \tag{3.13}$$

$p \in L^2(\Omega)/\mathbb{R}$ means that $p$ is defined only to within an arbitrary constant.

We refer to [18] and the references therein for the discussion of iterative
and direct methods for solving problem (3.12). Our favorite method is at the
moment a conjugate gradient variant, discussed in [18], of the following
algorithm (introduced in [20]):


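$$p^0 \in L^2(\Omega) \text{ given}; \tag{3.14}$$

then for $n \ge 0$, assuming that $p^n$ is known, we compute $u^n$ and $p^{n+1}$ by

$$\begin{cases}
\alpha u^n - \nu \Delta u^n = f - \nabla p^n \text{ in } \Omega, \\
u^n = g \text{ on } \Gamma,
\end{cases} \tag{3.15}$$

$$\begin{cases}
-\Delta \varphi^n = \nabla \cdot u^n \text{ in } \Omega, \\
\dfrac{\partial \varphi^n}{\partial n} = 0 \text{ on } \Gamma, \quad \int_\Omega \varphi^n \, dx = 0,
\end{cases} \tag{3.16}$$

and with $\rho > 0$,

$$p^{n+1} = p^n - \rho(\nu \nabla \cdot u^n + \alpha \varphi^n). \tag{3.17}$$

Concerning the convergence of algorithm (3.14) - (3.17), one can prove by a
variant of the techniques discussed in [14, Chapter 7] the following: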


PROPOSITION  3.1: 


P< 


N  i  + 


Suppose  that  we  have 

1 


(3. IS) 


with 


IIV0II  , 


c  =  sup 

<£eH-{0} 


I/2(Q)N 


IIA«JI!t2 


L  (Q) 


1  ,  H  =  .  H2(0)  ,  U  -  0  on  D 


Then, for all $p^0 \in L^2(\Omega)$, we have

$$\lim_{n \to +\infty} \{u^n, p^n\} = \{u, p_0\} \text{ in } (H^1(\Omega))^N \times L^2(\Omega), \tag{3.20}$$

where $\{u, p_0\}$ is the solution of the Stokes problem (3.12) such that

$$\int_\Omega p_0 \, dx = \int_\Omega p^0 \, dx. \tag{3.21}$$

In fact, the convergence is linear since $\|u^n - u\|_{(H^1(\Omega))^N}$ and $\|p^n - p_0\|_{L^2(\Omega)}$
converge to zero at least as fast as geometric sequences whose ratio is less
than one.
Using the conjugate gradient version of algorithm (3.14) - (3.17) described
in, e.g., [18, Section 4], the analogue of $\rho$ is adjusted automatically at each
iteration, making the calculation of $c$ unnecessary; we obtain moreover a much faster
convergence.

REMARK 3.1: It follows from [20] - [22] that if we assume that $\Omega$ is a
hypercube of $\mathbb{R}^N$ and that we have periodic boundary conditions in (3.12),
(3.15), (3.16), then algorithm (3.14) - (3.17) converges in one iteration, for each
$p^0 \in L^2(\Omega)$.
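The one-iteration convergence in the periodic case can be illustrated with a small Fourier experiment. The sketch below is not the authors' implementation: the Fourier-spectral discretization on the periodic unit square, the function names, the random data and the parameter values are all assumptions of ours. A mode-by-mode calculation gives $p^{n+1} - p = (1 - \rho)(p^n - p)$ for the non-constant modes, so we additionally assume $\rho = 1$, a value the text does not state, as the one for which a single pass suffices.

```python
import numpy as np

def wavenumbers(n):
    """Angular wavenumbers on the periodic unit square, n points per side."""
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / n)
    return np.meshgrid(k, k, indexing="ij")

def stokes_pass(p_hat, f_hat, kx, ky, alpha, nu, rho):
    """One pass of (3.15)-(3.17), mode by mode in Fourier space.

    Returns the Fourier coefficients of div(u^n) and of p^{n+1}.
    """
    k2 = kx**2 + ky**2
    # (3.15): alpha*u - nu*Laplace(u) = f - grad(p)
    ux = (f_hat[0] - 1j * kx * p_hat) / (alpha + nu * k2)
    uy = (f_hat[1] - 1j * ky * p_hat) / (alpha + nu * k2)
    div = 1j * kx * ux + 1j * ky * uy
    # (3.16): -Laplace(phi) = div(u), zero mean (the k = 0 mode is set to 0)
    phi = np.zeros_like(div)
    nonzero = k2 > 0
    phi[nonzero] = div[nonzero] / k2[nonzero]
    # (3.17): pressure update
    return div, p_hat - rho * (nu * div + alpha * phi)

def divergence_history(n=32, alpha=100.0, nu=0.01, rho=1.0, passes=3, seed=0):
    """Largest Fourier coefficient of div(u^n) over successive passes."""
    rng = np.random.default_rng(seed)
    kx, ky = wavenumbers(n)
    f_hat = np.fft.fft2(rng.standard_normal((2, n, n)))
    p_hat = np.fft.fft2(rng.standard_normal((n, n)))
    history = []
    for _ in range(passes):
        div, p_hat = stokes_pass(p_hat, f_hat, kx, ky, alpha, nu, rho)
        history.append(float(np.max(np.abs(div))))
    return history

one_shot = divergence_history(rho=1.0)    # divergence vanishes after one update
damped = divergence_history(rho=0.5)      # divergence is halved at each pass
```

For $\rho \ne 1$ (with $0 < \rho < 2$) the same experiment shows the linear convergence mentioned in Proposition 3.1, with ratio $|1 - \rho|$ in this periodic setting.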

REMARK 3.2: The remark made in Section 3.4 concerning the solution of the linear
systems associated with the discrete variants of the elliptic operator (3.11) still
applies to (3.15). Therefore, to solve the discrete versions of (3.15) we shall use
successive over-relaxation, or non-preconditioned conjugate gradient methods. In fact,
our preference goes to the over-relaxation methods since they are much easier to
parallelize and/or vectorize. Unfortunately the same remark does not apply to the
Neumann problem (3.16); for the discrete variants of this problem, we have been
using a Cholesky factorization taking into account the fact that the matrix is of
maximal rank minus one, and also the fact that in practice the pressure is
approximated on a grid twice coarser than the velocity grid (see, again, [14], [18]
for more details).

3.6.  NUMERICAL  EXPERIMENTS 

Combining  the  numerical  methods  described  in  the  above  sections  with  the  finite 
element  approximations  discussed  in  [14,  Chapter  7]  and  [18,  Section  5]  we  have  been 
considering  the  following  test  problem  (corresponding  to  a  double  jet  in  a  square 
cavity). 
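Here $\Omega = (0,1)^2$, $\nu = 1/8000$, and the boundary conditions are the
following:

$$g(x_1, x_2) = 0 \text{ if } x_1 = 0 \text{ or } 1, \tag{3.22$_1$}$$

$$\begin{cases}
g(x_1, 1) = 0 \text{ if } 0 \le x_1 \le 1/3, \; 19/48 \le x_1 \le 29/48, \; 2/3 \le x_1 \le 1, \\
g(x_1, 1) = -1024 \left\{\tfrac{\sqrt{2}}{2}, \tfrac{\sqrt{2}}{2}\right\} (x_1 - 1/3)(19/48 - x_1) \text{ if } \tfrac{1}{3} \le x_1 \le \tfrac{19}{48} \; \left(= \tfrac{1}{3} + \tfrac{1}{16}\right), \\
g(x_1, 1) = 1024 \left\{\tfrac{\sqrt{2}}{2}, -\tfrac{\sqrt{2}}{2}\right\} (2/3 - x_1)(x_1 - 29/48) \text{ if } \left(\tfrac{2}{3} - \tfrac{1}{16} =\right) \tfrac{29}{48} \le x_1 \le \tfrac{2}{3},
\end{cases} \tag{3.22$_2$}$$

$$\begin{cases}
g(x_1, 0) = 0 \text{ if } \tfrac{1}{16} \le x_1 \le \tfrac{15}{16}, \\
g(x_1, 0) = -1024 \left\{0, \tfrac{\sqrt{2}}{2}\right\} x_1 \left(\tfrac{1}{16} - x_1\right) \text{ if } 0 \le x_1 \le \tfrac{1}{16}, \\
g(x_1, 0) = -1024 \left\{0, \tfrac{\sqrt{2}}{2}\right\} (1 - x_1)\left(x_1 - \tfrac{15}{16}\right) \text{ if } \tfrac{15}{16} \le x_1 \le 1,
\end{cases} \tag{3.22$_3$}$$

corresponding to injection of fluid by the upper apertures, and ejection by the
two lower holes.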

From (3.22), we see that both apertures are 1/16 wide, that the two jets'
inclinations are 45°, the left (resp. right) one being oriented toward the left (resp.
the right) wall. We can also see that the maximum injection velocity is one, and
that the fluid is ejected from the cavity by two holes, located in the lower corners,
whose width is also 1/16. Parabolic profiles of velocity have been assumed at all
apertures and holes.

Finally, we assume that the flow is initially at rest, i.e.,

$$u(x, 0) = 0 \text{ in } \Omega. \tag{3.23}$$


From these characteristics, we can see that we actually need two Reynolds numbers
(at least) to characterize this jet problem; indeed, if one takes the dimension of the
jet apertures as characteristic length, we clearly have Re = 500, but if we
consider the length of the cavity as another characteristic length the corresponding
Re is now 8000; actually for the two upper corners we can also define a local
Reynolds number of 8000/3 = 2666.66..., since 1/3 is the distance of the
apertures to the closest corner (and corresponding vertical wall).
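The three values quoted above follow from the usual definition of the Reynolds number. The worked computation below assumes $Re = UL/\nu$ with $U = 1$ the maximum injection velocity and $\nu = 1/8000$; the symbols $U$ and $L$ are introduced here for convenience.

```latex
\begin{align*}
Re &= \frac{U L}{\nu}, \qquad U = 1, \quad \nu = \frac{1}{8000}, \\
L &= \tfrac{1}{16} \ \text{(aperture width)}: & Re &= 8000 \times \tfrac{1}{16} = 500, \\
L &= 1 \ \text{(cavity length)}: & Re &= 8000 \times 1 = 8000, \\
L &= \tfrac{1}{3} \ \text{(aperture-to-corner distance)}: & Re &= 8000 \times \tfrac{1}{3} \approx 2666.67.
\end{align*}
```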

Our goal with these numerical experiments is to simulate the bouncing of the
jets on the closest vertical wall and to observe the development of the vortex
pattern by visualization of the streamlines (the streamlines have been obtained
as the contour lines of the stream function $\psi$, the solution of the Poisson
equation

$$-\Delta \psi = \omega,$$

completed by adequate Dirichlet boundary conditions (see [18, Section 6]), with the
vorticity $\omega$ defined by

$$\omega = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}).$$


Following [14, Chapter 7] and [18, Section 5] the velocity has been approximated
by continuous functions, piecewise linear on a regular triangulation consisting
of $2 \times (128)^2$ triangles; the pressure has also been approximated by continuous
and piecewise linear functions, but this time on a triangulation twice coarser
than the velocity one. The total number of unknowns is then of the order of 32000
for the velocity and 4000 for the pressure. Concerning the time step, we have
taken $\Delta t = 10^{-2}$ and used the $\theta$-scheme (3.6) - (3.9) with $\theta = 1 - 1/\sqrt{2}$ and $\alpha$,
$\beta$ given by (2.29).

We have shown on Figures 3.1 to 3.14 the streamlines corresponding to the
computed solution at t = 0.01, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 9.5, 10.0,
respectively. From those calculations we have also been able to follow the
evolution in time of the kinetic energy $\frac{1}{2} \int_\Omega |u|^2 \, dx$ and of the enstrophy
$\frac{1}{2} \int_\Omega |\omega|^2 \, dx$; the corresponding results, together with further comments, will be
reported elsewhere. Numerical simulations done with smaller $\Delta t$ give back the
same numerical results.

All  these  calculations  have  been  done  on  a  CRAY-XMP  201. 

4.  APPLICATION  TO  LIQUID  CRYSTAL  CALCULATIONS. 

We  follow  the  presentation  in  [9]  ,  [23]  . 

4.1.  Formulation  of  the  Problem. 

"Imbedding" a steady state problem in a time dependent one is a well
known method to solve the former one. A perfect illustration is given in this
section, where to a nonconvex variational problem originating from the
mathematical theory of liquid crystals we associate a nonlinear parabolic
"equation" which is solved by the operator splitting methods described in the
above paragraphs.

Let $\Omega$ be a bounded domain of $\mathbb{R}^3$; we denote by $\Gamma$ the boundary of $\Omega$
and we suppose that $\Gamma$ is sufficiently smooth (Lipschitz continuous, for example).
We define now


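$$\mathbf{H}^1(\Omega) = (H^1(\Omega))^3,$$

then, with $v = \{v_i\}_{i=1}^3 \in \mathbf{H}^1(\Omega)$,

$$J(v) = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx \quad \left(= \frac{1}{2} \sum_{i=1}^3 \int_\Omega |\nabla v_i|^2 \, dx\right), \tag{4.1}$$

and finally

$$E = \{v \mid v \in \mathbf{H}^1(\Omega), \; v = g \text{ on } \Gamma, \; |v(x)| = 1 \text{ a.e.}\} \tag{4.2}$$

(where $|v| = \left(\sum_{i=1}^3 v_i^2\right)^{1/2}$); we suppose that $g$ is such that $E \ne \emptyset$.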

Remark 4.1. Consider $a \in \mathbb{R}^3$ and define $\varphi_a$ as the restriction to $\Omega$ of the
function

$$x \to \frac{x - a}{|x - a|}.$$

We clearly have $|\varphi_a(x)| = 1$ a.e.; furthermore, we can easily prove that $\varphi_a \in \mathbf{H}^1(\Omega)$
(even if $a \in \Omega$).

We consider now the following minimization problem:

$$\text{Find } u \in E \text{ such that } J(u) \le J(v) \text{ for all } v \in E. \tag{4.3}$$

Using the fact that $E$ is weakly closed in $\mathbf{H}^1(\Omega)$, we can easily prove that
problem (4.3) has at least one solution; further mathematical properties of (4.3) are
discussed in [24], [25]. Problem (4.3) is associated with the mathematical modeling of
interesting physical phenomena (as discussed in Section 1 of [25]), some of
them occurring in the physics of liquid crystals (see [26] - [28] for further
information on liquid crystals).


4.2.  Numerical  Solution  of  Problem  (4.3). 

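At first glance, problem (4.3) seems to be a nontrivial problem of the
Calculus of Variations. In fact, the solution of (4.3) is quite easy to achieve by
the operator splitting methods of Section 2. This follows indeed from the fact
that problem (4.3) is equivalent to

$$\text{Find } u \in \mathbf{H}^1_g \text{ such that } J(u) + I_\Sigma(u) \le J(v) + I_\Sigma(v) \text{ for all } v \in \mathbf{H}^1_g, \tag{4.4}$$

where (with $\mathbf{L}^2(\Omega) = (L^2(\Omega))^3$)

$$\mathbf{H}^1_g = \{v \mid v \in \mathbf{H}^1(\Omega), \; v = g \text{ on } \Gamma\},$$

$$\Sigma = \{v \mid v \in \mathbf{L}^2(\Omega), \; |v(x)| = 1 \text{ a.e.}\},$$

and where $I_\Sigma : \mathbf{L}^2(\Omega) \to \mathbb{R} \cup \{+\infty\}$ is defined by

$$I_\Sigma(v) = \begin{cases} 0 & \text{if } v \in \Sigma, \\ +\infty & \text{if } v \notin \Sigma. \end{cases}$$

Using the notation of the above section, we have for (4.4) the following Euler-Lagrange
"equation"

$$\begin{cases}
-\Delta u + \partial I_\Sigma(u) = 0 \text{ in } \Omega, \\
u = g \text{ on } \Gamma,
\end{cases} \tag{4.5}$$

where $\partial I_\Sigma(u)$ is the "gradient" of $I_\Sigma$ at $u$. We associate next to the
nonlinear elliptic equation (4.5) the nonlinear parabolic problem

$$\begin{cases}
\dfrac{\partial u}{\partial t} - \Delta u + \partial I_\Sigma(u) = 0 \text{ in } \Omega, \\
u = g \text{ on } \Gamma, \\
u(0) = u_0.
\end{cases} \tag{4.6}$$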
Concentrating on the $\theta$-scheme (2.15) - (2.18) (since it appears as the most
efficient method here) we obtain the following algorithm:

$$u^0 = u_0, \text{ given in } \mathbf{H}^1_g; \tag{4.7}$$

then for $n \ge 0$, $u^n$ being known, we compute $u^{n+\theta}$, $u^{n+1-\theta}$, $u^{n+1}$ as follows

$$\frac{u^{n+\theta} - u^n}{\theta \Delta t} - \Delta u^n + \partial I_\Sigma(u^{n+\theta}) = 0, \tag{4.8}$$

$$\begin{cases}
\dfrac{u^{n+1-\theta} - u^{n+\theta}}{(1 - 2\theta)\Delta t} - \Delta u^{n+1-\theta} + \partial I_\Sigma(u^{n+\theta}) = 0 \text{ in } \Omega, \\
u^{n+1-\theta} = g \text{ on } \Gamma,
\end{cases} \tag{4.9}$$

$$\frac{u^{n+1} - u^{n+1-\theta}}{\theta \Delta t} - \Delta u^{n+1-\theta} + \partial I_\Sigma(u^{n+1}) = 0. \tag{4.10}$$

When using algorithm (4.7) - (4.10) for practical calculations one has to give a sense
to the two multivalued equations (4.8) and (4.10). The interpretation given to
(4.8) is

$$u^{n+\theta} \in \Sigma; \quad u^{n+\theta} \text{ minimizes over } \Sigma \text{ the functional } v \to \frac{1}{2} \int_\Omega |v|^2 \, dx - \int_\Omega (u^n + \theta \Delta t \Delta u^n) \cdot v \, dx. \tag{4.11}$$

The solution of problem (4.11) is clearly given by

$$u^{n+\theta} = \frac{u^n + \theta \Delta t \Delta u^n}{|u^n + \theta \Delta t \Delta u^n|}. \tag{4.12}$$

Similarly, the solution of (4.10) is given by

$$u^{n+1} = \frac{u^{n+1-\theta} + \theta \Delta t \Delta u^{n+1-\theta}}{|u^{n+1-\theta} + \theta \Delta t \Delta u^{n+1-\theta}|}. \tag{4.13}$$
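The closed forms (4.12), (4.13) say that the constrained steps reduce to a pointwise normalization: on the unit sphere the integrand of (4.11) equals $\frac{1}{2} - w \cdot v$, with $w$ the given vector, so it is minimized by $v = w/|w|$. The sketch below checks this at a single point against random unit vectors. It is an illustration only; the function names, the random sample and the tolerance are our own choices, and the case $w = 0$ (where the minimizer is not unique) is not handled.

```python
import numpy as np

def project_to_sphere(w):
    """Pointwise normalization w -> w / |w| used in (4.12) and (4.13).
    Assumes w is nonzero at every point."""
    w = np.asarray(w, dtype=float)
    return w / np.linalg.norm(w, axis=-1, keepdims=True)

def pointwise_functional(v, w):
    """Integrand of (4.11) at one point: |v|^2 / 2 - w . v."""
    return 0.5 * np.sum(v * v, axis=-1) - np.sum(w * v, axis=-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(3)                       # stands for u^n + theta*dt*Laplace(u^n)
v_star = project_to_sphere(w)

# Compare with many random competitors lying on the unit sphere.
competitors = project_to_sphere(rng.standard_normal((10000, 3)))
best_competitor = float(pointwise_functional(competitors, w).min())
value_at_projection = float(pointwise_functional(v_star, w))
```

Because the normalization acts independently at every grid point, these two steps cost almost nothing and vectorize trivially; only the linear problem (4.9) requires a solver.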


Once $u^{n+\theta}$ is known, we obtain $\partial I_\Sigma(u^{n+\theta})$ from (4.8) and we use that
information in (4.9) to compute $u^{n+1-\theta}$ via the solution of a Dirichlet problem
for the elliptic operator

$$v \to v - (1 - 2\theta)\Delta t \Delta v.$$

From these observations, the only costly step of algorithm (4.7) - (4.10) is the
Dirichlet problem (4.9); in fact, since in practice $\Delta t$ has to be small, the discrete
variants of the above elliptic operator are well conditioned matrices for which
relaxation (and over-relaxation) methods are very efficient (see [9], [23] for
more details).

4.3.  NUMERICAL  EXPERIMENTS. 

The numerical techniques described in Section 4.2 have been applied to the
solution of various test problems in [9], [23] (see also [20] for related numerical
experiments). In this paper we shall only consider the test problem for which

$$\Omega = (0,1)^3$$ (4.14)

and

$$g(x) = \frac{x - a}{|x - a|},$$ (4.15)_1

$$a = \{.5, .5, .5\}.$$ (4.15)_2

It follows from [25] that if $g$ is defined by (4.15), then problem (4.3) has a
unique solution which is precisely given by

$$u(x) = \frac{x - a}{|x - a|}.$$ (4.16)

From the simplicity of $\Omega$, it is quite convenient to approximate problem (4.3) by
a finite difference method such as the one described below.

Let $N$ be a positive integer; we define a space discretization step $h$ by
$h = 1/(N+1)$ and then the discrete set

$$\{M_{ijk}\}_{0 \le i,j,k \le N+1}, \quad \text{with } M_{ijk} = \{ih, jh, kh\}.$$

With $v_h = \{v_{ijk}\}_{0 \le i,j,k \le N+1}$, we approximate $J(v)$ by


Finally, problem (4.3) is approximated by:

Find $u_h \in \Sigma_h$ such that $J_h(u_h) \le J_h(v_h)$ for all $v_h \in \Sigma_h$. (4.19)

Applying the $\theta$-scheme discussed in Section 4.2 is quite easy since the finite
dimensional problem (4.19) has the same structure as (4.3).

All the calculations have been initialized by $u_h^0$, the finite
difference approximation of the solution of the Dirichlet problem

$$-\Delta u^0 = 0 \ \text{in } \Omega, \qquad u^0 = g \ \text{on } \Gamma.$$ (4.20)

As convergence criterion, we have used (with obvious notation)

(h3  £  lu^t1  -  ufc|a),/a 

- -  —  <  10"'1  .  (4.21) 

av3  £  KtU2)l/3 

i  <;N 

Since the exact solution of problem (4.3) is known here (and is given by (4.15),
(4.16)) we can accurately estimate the $L^2(\Omega)$-norm of the approximation error; we
have chosen as estimator of the $L^2(\Omega)$-error the quantity $\rho_h$ defined by

$$\rho_h = \Big(h^3 \sum_{1 \le i,j,k \le N} |u(M_{ijk}) - u_{ijk}|^2\Big)^{1/2}$$ (4.22)

(we took here $u(1/2, 1/2, 1/2) = \{1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3}\}$).

Using a discrete variant of algorithm (4.7)-(4.10) in which the elliptic
problems (4.9) and (4.10) are solved by the Gauss-Seidel method, we need 8 time
steps to reach a steady state if one takes $h = 1/20$, $\theta = .1$ and $\Delta t = 1/200$;
the corresponding CPU time is 11mn 18s on a VAX 11/780. We have then
$\rho_h = 0.39 \times 10^{-1}$. When using the Gauss-Seidel method we have initialized the

calculation  of  uh  by  t)  and  then  the  calculation  of  by  , 

taking as stopping criterion one similar to (4.21) but with a tolerance of $10^{-6}$. In Table 4.1
we show, at each time step, the number of Gauss-Seidel iterations necessary to
converge according to the above test.


Step        G.S. iterations
1           26
2           4
3           2
4 to 8      1

TABLE 4.1

Variation of the number of Gauss-Seidel iterations with the time step.

If, instead of the Gauss-Seidel method, we use an over-relaxation one with the
optimal parameter $\omega$, the performance of our computational method is dramatically
improved (particularly for the first time step), and for the above test problem (with
the same values of $\theta$, $h$, and $\Delta t$) we have convergence in 5 time steps (instead of 8),
the CPU time being reduced to 3mn 44s (instead of 11mn 18s); actually, the $L^2$-error
is also substantially reduced since we have now $\rho_h = 0.23 \times 10^{-1}$ (instead of
$0.39 \times 10^{-1}$).

Table 4.2, just below, shows the variations of the $L^2$-error $\rho_h$ as a function
of $h$; these results have been obtained using over-relaxation instead of the
Gauss-Seidel method. The case $h = 1/40$ has been computed on the CRAY-XMP
201, taking approximately 12 seconds (the discrete problem (4.19) involves then
approximately $1.8 \times 10^5$ unknowns).


h        Δt        θ        ρ_h
1/10     1/100     1/10     0.74 x 10^-1
1/20     1/200     1/10     0.23 x 10^-1
1/40     1/2000    1/5      0.12 x 10^-1

TABLE 4.2

Variation of the L2-error with h.


The results in Table 4.2 suggest that the $L^2$-approximation error is in $O(h)$ at
best; the analysis of such an error is an interesting problem in itself.
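
The observed rate behind this remark can be estimated directly from the three rows of Table 4.2. The sketch below is only indicative, since $\Delta t$ and $\theta$ also change from row to row; the variable names are illustrative.

```python
import math

# values copied from Table 4.2
h = [1 / 10, 1 / 20, 1 / 40]
rho = [0.74e-1, 0.23e-1, 0.12e-1]

# observed order between consecutive rows: log(rho_k / rho_{k+1}) / log(h_k / h_{k+1})
orders = [
    math.log(rho[k] / rho[k + 1]) / math.log(h[k] / h[k + 1])
    for k in range(len(h) - 1)
]
```

The two estimates come out at roughly 1.7 and 0.94, so the rate between the two finest meshes is indeed close to first order.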

4.4.  FURTHER  COMMENTS.

Relaxation methods for solving problem (4.19) are discussed in [30]; they
appear to be quite efficient, one of the reasons for such an efficiency being the
fact that the quadratic constraint

$$|v(x)|^2 = 1 \ \text{a.e.}$$ (4.23)

is a fairly simple one since it does not involve derivatives of $v$. Suppose now that
instead of (4.23) we have to deal with

$$\det(I + \nabla v(x)) = 1 \ \text{a.e.}$$ (4.24)

(nonlinear constraints such as (4.24) occur in incompressible finite elasticity; cf.
[9] and the references therein). Relaxation methods cannot be applied any longer, at
least directly. On the contrary, operator splitting techniques like those discussed
in Section 2 still apply; see [9] for more details, further comments and numerical
results concerning the treatment of nonlinear constraints such as (4.24).

5.  NUMERICAL  SOLUTION  OF  ADVECTION-DIFFUSION  PROBLEMS  IN 
HIGH-DIMENSION. 

5.1.  MOTIVATION.  SYNOPSIS. 

Let us consider the following stochastic ordinary differential equation

$$dX = V(X)\,dt + dW$$ (5.1)

where $X$ is an $N$-dimensional vector and $W$ a noise. Assuming convenient
hypotheses on the noise $W$, the probability of finding at time $t$ the state
vector $X$ in a neighborhood of $x \in \mathbb{R}^N$ of measure $dx = dx_1 \dots dx_N$
is $p(x,t)\,dx$ where the probability density $p$ satisfies a parabolic equation.
In the particular case where $V$ is divergence free, i.e.

$$\nabla \cdot V = 0$$ (5.2)

and for simple noise models, this parabolic equation reduces to

$$\frac{\partial p}{\partial t} - \varepsilon \nabla^2 p + V \cdot \nabla p = 0.$$ (5.3)

An interesting (and difficult) case is the one where the level of noise is "weak",
implying that $\varepsilon$ is "small"; we have then an advection dominated
advection-diffusion equation.

The solution of such equations plays a fundamental role in the implementation
of some solution methods of the Zakai equation occurring in Stochastic Optimal
Control.

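In the sequel we shall consider the following initial boundary value problem

$$\frac{\partial p}{\partial t} - \varepsilon \nabla^2 p + V \cdot \nabla p = f \ \text{in } \Omega \times (0,T),$$ (5.4)_1

$$p = g \ \text{on } \Gamma \times (0,T),$$ (5.4)_2

$$p(x,0) = p_0(x) \ \text{in } \Omega,$$ (5.4)_3

with $\Omega \subset \mathbb{R}^N$.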

For solving such problems, the numerical analyst has to face two outstanding
difficulties, namely

(i) When $\varepsilon$ is small, the problem is advection dominated,

(ii) For practical problems, we usually have $N > 3$.

In the following sections, we shall describe for $N = 2$ to $6$ the solution of (5.4)
by various upwinding methods and by the modified (i.e. backward) method
of characteristics. Numerical results will be presented for the particular case
where $\Omega = (0,1)^N$.

5.2.  SOLUTION  OF  PROBLEM  (5.4)  BY  FINITE  DIFFERENCES  AND
UPWINDING  METHODS.

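We consider for simplicity the case where $\Omega = (0,1)^N$ with $N = 2$; the
extension to $N > 2$ is straightforward. With $I$ a positive integer, we
define $h$ by $h = 1/(I+1)$ and consider over $\bar\Omega = \Omega \cup \Gamma$ the discretization
points

$$M_{ij} = \{ih, jh\}; \quad 0 \le i,j \le I+1.$$ (5.5)

At the points $M_{ij}$ interior to $\Omega$ (i.e., $1 \le i,j \le I$) we approximate (5.4) by
the following finite difference scheme (with $V = \{V_1, V_2\}$):

$$\frac{p^{n+1}_{ij} - p^n_{ij}}{\Delta t} - \varepsilon\,\frac{p^{n+1}_{i+1,j} + p^{n+1}_{i-1,j} + p^{n+1}_{i,j+1} + p^{n+1}_{i,j-1} - 4p^{n+1}_{ij}}{h^2}$$
$$+ V_1^+(M_{ij})\,\frac{p^{n+1}_{ij} - p^{n+1}_{i-1,j}}{h} - V_1^-(M_{ij})\,\frac{p^{n+1}_{i+1,j} - p^{n+1}_{ij}}{h}$$
$$+ V_2^+(M_{ij})\,\frac{p^{n+1}_{ij} - p^{n+1}_{i,j-1}}{h} - V_2^-(M_{ij})\,\frac{p^{n+1}_{i,j+1} - p^{n+1}_{ij}}{h} = f(M_{ij}, (n+1)\Delta t),$$ (5.6)_1

with in (5.6)_1:

(i) $\Delta t$ (> 0) a time discretization step;
(ii) $p^n_{ij} \approx p(M_{ij}, n\Delta t)$;
(iii) $a^+ = \max(0, a)$, $a^- = \max(0, -a)$, $\forall a \in \mathbb{R}$;
(iv) $p^{n+1}_{kl} = g(M_{kl}, (n+1)\Delta t)$ if $M_{kl} \in \Gamma$; (5.6)_2
(v) $p^0_{kl} = p_0(M_{kl})$.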


Scheme (5.6) is of the backward Euler type for the time discretization
and of the first order upwind type for the space discretization. Probabilists
favor the finite difference scheme (5.6) because it satisfies a discrete
maximum principle and therefore possesses a probabilistic interpretation.
Unfortunately the above scheme is only first order accurate, quite dissipative
and not well-suited for those situations where $\varepsilon$ is small and $V$ has fast
variations over the space domain $\Omega$.
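
The discrete maximum principle can be observed on a small example. The sketch below implements one backward Euler, first order upwind step for $N = 2$ with $f = 0$ and $g = 0$, solved by plain Gauss-Seidel sweeps. The constant velocity field, the grid size and the function name are assumptions chosen for illustration; they are not the data of the experiments reported later.

```python
import numpy as np

def upwind_backward_euler_step(p, V1, V2, eps, h, dt, tol=1e-12, max_sweeps=500):
    """One step of the first order upwind scheme (f = 0, g = 0), by Gauss-Seidel."""
    a1p, a1m = np.maximum(V1, 0.0), np.maximum(-V1, 0.0)   # V1^+, V1^-
    a2p, a2m = np.maximum(V2, 0.0), np.maximum(-V2, 0.0)   # V2^+, V2^-
    d = eps / h**2
    diag = 1.0 / dt + 4.0 * d + (a1p + a1m + a2p + a2m) / h
    q = p.copy()                      # boundary values of p (zero) are kept
    n = p.shape[0]
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                rhs = (p[i, j] / dt
                       + (d + a1p[i, j] / h) * q[i - 1, j]
                       + (d + a1m[i, j] / h) * q[i + 1, j]
                       + (d + a2p[i, j] / h) * q[i, j - 1]
                       + (d + a2m[i, j] / h) * q[i, j + 1])
                new = rhs / diag[i, j]
                change = max(change, abs(new - q[i, j]))
                q[i, j] = new
        if change < tol:
            break
    return q

n = 12                                # grid points per direction, h = 1/(n-1)
h = 1.0 / (n - 1)
x = np.arange(n) * h
X, Y = np.meshgrid(x, x, indexing="ij")
# hump supported in (0,1/2)^2, zero elsewhere
p0 = np.where((X < 0.5) & (Y < 0.5), 16**2 * X * (0.5 - X) * Y * (0.5 - Y), 0.0)
V1 = np.full((n, n), 1.0)             # hypothetical constant velocity field
V2 = np.full((n, n), 0.5)
p = p0.copy()
for _ in range(5):
    p = upwind_backward_euler_step(p, V1, V2, eps=1e-3, h=h, dt=h / 2)
```

Every update is a weighted average, with nonnegative weights summing to one, of the old value and the four neighbours, so the computed values stay between $0$ and $\max p_0$.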

An interesting alternative to (5.6) is obtained through a space/time
discretization which is second order and also of the upwind type (however, it does
not satisfy the discrete maximum principle). Such a scheme is obtained as follows:

$$p^0_{kl} = p_0(M_{kl}), \quad p^1_{ij} \text{ is obtained (for example) via (5.6);}$$ (5.7)_1

then for $n \ge 1$ and $2 \le i,j \le I-1$ discretize (5.4)_1 by

$$\frac{\frac{3}{2}p^{n+1}_{ij} - 2p^n_{ij} + \frac{1}{2}p^{n-1}_{ij}}{\Delta t} - \varepsilon\,\frac{p^{n+1}_{i+1,j} + p^{n+1}_{i-1,j} + p^{n+1}_{i,j+1} + p^{n+1}_{i,j-1} - 4p^{n+1}_{ij}}{h^2}$$
$$+ V_1^+(M_{ij})\,\frac{\frac{3}{2}p^{n+1}_{ij} - 2p^{n+1}_{i-1,j} + \frac{1}{2}p^{n+1}_{i-2,j}}{h} + V_1^-(M_{ij})\,\frac{\frac{3}{2}p^{n+1}_{ij} - 2p^{n+1}_{i+1,j} + \frac{1}{2}p^{n+1}_{i+2,j}}{h}$$
$$+ V_2^+(M_{ij})\,\frac{\frac{3}{2}p^{n+1}_{ij} - 2p^{n+1}_{i,j-1} + \frac{1}{2}p^{n+1}_{i,j-2}}{h} + V_2^-(M_{ij})\,\frac{\frac{3}{2}p^{n+1}_{ij} - 2p^{n+1}_{i,j+1} + \frac{1}{2}p^{n+1}_{i,j+2}}{h} = f(M_{ij}, (n+1)\Delta t).$$ (5.7)_2

If $M_{ij} \in \Omega$ with either $M_{i\pm1,j}$ or $M_{i,j\pm1}$ on $\Gamma$ it is possible that $M_{i\pm2,j}$
or $M_{i,j\pm2}$ does not belong to $\bar\Omega$; in such a case we can use to discretize $V \cdot \nabla p$
at $M_{ij}$ a first order scheme like in (5.6)_1 or alternatively a centered, second
order approximation like

$$(V \cdot \nabla p)(M_{ij}) \approx V_1(M_{ij})\,\frac{p_{i+1,j} - p_{i-1,j}}{2h} + V_2(M_{ij})\,\frac{p_{i,j+1} - p_{i,j-1}}{2h}.$$ (5.8)


The boundary conditions are treated as in (5.6)_2. The fact that the problems
under consideration may have fast dynamics requires the use of small $h$ and $\Delta t$;
indeed, as in Sections 3 and 4, we can take advantage of the fact that $\Delta t$ is small to
solve the above discrete problems by successive over-relaxation, since that method
has good vectorization and parallelization properties (in practice a few iterations will
ensure convergence at each time step).

Numerical  experiments  definitely  show  the  superiority  of  the  second  order 
upwinding  over  the  first  order  method  (it  is  more  accurate,  less  dissipative  and 
almost  as  easy  to  implement). 


5.3.  SOLUTION  OF  PROBLEM  (5.4)  BY  FINITE  DIFFERENCES  AND  A  BACKWARD 
METHOD  OF  CHARACTERISTICS. 

As discussed in [31], [32] (see also the references therein) the backward (or
modified) method of characteristics can be a most interesting tool for solving
advection dominated problems.

The  basic  principle  of  the  method  is  fairly  simple  and  will  be  discussed  on  the 
continuous  problem  only. 

Let us define the total time derivative operator $\frac{D}{Dt}$ by

$$\frac{Dp}{Dt} = \frac{\partial p}{\partial t} + V \cdot \nabla p$$ (5.9)

and consider the characteristic flow associated to $(x,t) \in \mathbb{R}^N \times \mathbb{R}$, i.e. the
$N$-dimensional vector $X(\tau; x, t)$ solution of the ordinary differential system

$$\frac{dX}{d\tau} = V(X), \qquad X(t; x, t) = x.$$ (5.10)

With the above relations the parabolic equation (5.4)_1 can also be written

$$\frac{Dp}{Dt} - \varepsilon \nabla^2 p = f \ \text{in } \Omega \times (0,T),$$ (5.11)

and discretized along the characteristics at time $(n+1)\Delta t$ by the elliptic
equation

$$\frac{p^{n+1}(x) - p^n[X(n\Delta t; x, (n+1)\Delta t)]}{\Delta t} - \varepsilon \nabla^2 p^{n+1}(x) = f^{n+1}(x), \qquad p^{n+1} = g^{n+1} \ \text{on } \Gamma;$$ (5.12)

more sophisticated schemes can be used (cf. [33]).

In practice, to compute $\tilde p^n(x) = p^n(X(n\Delta t; x, (n+1)\Delta t))$ we shall integrate
(5.10) numerically, starting from a grid point, and track back from $(n+1)\Delta t$
to $n\Delta t$.

Several situations may occur:

(i) If the characteristic curve crosses the boundary at
$t^* \in (n\Delta t, (n+1)\Delta t)$ we shall replace $\Delta t$ by $(n+1)\Delta t - t^*$ and take for
$\tilde p^n(x)$ the value of $g$ at $\{x^*, t^*\}$ where $x^*$ is the point at the crossing of $\Gamma$ and
of the characteristic curve.

(ii) If $X(n\Delta t; x, (n+1)\Delta t) \in \Omega$ it necessarily belongs to a cell defined by
grid points; we shall then use, for example, an interpolation technique to compute
$\tilde p^n(x)$ (see Figure 5.1).


Figure 5.1: Backtracking Along the Characteristics.


Low order interpolation methods can lead to an overall method which may be quite
dissipative; on the other hand, high order interpolation methods are costly in high
dimension and not very easy to code, leading to software which is not easy to
vectorize or parallelize. We however think that these methods of characteristics
are promising, but clearly they deserve a lot of further investigation.

The discretization of the terms associated to the elliptic operator $-\varepsilon\nabla^2$ in
(5.12) is straightforward and is done by the same difference formula as in (5.6)_1
and (5.7)_2. The fully discrete system obtained from (5.12) is then solved by
those successive over-relaxation methods advocated in Section 5.2.
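
The backtracking and interpolation step can be sketched as follows for $N = 2$. This is a simplified illustration under stated assumptions: the characteristic is integrated by a single explicit Euler step, the interpolation is bilinear on the grid cells, and a foot falling outside $\bar\Omega$ is simply given the value $g = 0$ (the shortened time step of case (i) is not implemented). The function name is hypothetical.

```python
import numpy as np

def backtrack_and_interpolate(p, V1, V2, h, dt):
    """Values of p at the feet of the characteristics, on a grid with (n-1)*h = 1."""
    n = p.shape[0]
    x = np.arange(n) * h
    X, Y = np.meshgrid(x, x, indexing="ij")
    # one explicit Euler step backwards in time along dX/dtau = V(X)
    Xf = X - dt * V1
    Yf = Y - dt * V2
    inside = (Xf >= 0.0) & (Xf <= 1.0) & (Yf >= 0.0) & (Yf <= 1.0)
    Xc = np.clip(Xf, 0.0, 1.0)
    Yc = np.clip(Yf, 0.0, 1.0)
    # lower-left corner of the cell containing each foot
    i0 = np.minimum((Xc / h).astype(int), n - 2)
    j0 = np.minimum((Yc / h).astype(int), n - 2)
    s = Xc / h - i0
    t = Yc / h - j0
    val = ((1 - s) * (1 - t) * p[i0, j0] + s * (1 - t) * p[i0 + 1, j0]
           + (1 - s) * t * p[i0, j0 + 1] + s * t * p[i0 + 1, j0 + 1])
    # feet outside the domain take the boundary value g = 0
    return np.where(inside, val, 0.0)

n, h = 9, 1 / 8
rng = np.random.default_rng(1)
p = rng.random((n, n))
ones, zeros = np.ones((n, n)), np.zeros((n, n))
shifted = backtrack_and_interpolate(p, ones, zeros, h, dt=h)       # feet on grid points
smeared = backtrack_and_interpolate(p, 0.5 * ones, zeros, h, dt=h)  # feet at cell midpoints
```

When the feet coincide with grid points the profile is transported exactly; when they fall inside cells the bilinear average smooths the profile, which is the source of the dissipation mentioned above.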

5.4.  NUMERICAL  EXPERIMENTS. 

All the numerical experiments are concerned with problem (5.4) when
$\Omega = (0,1)^N$, $f = 0$, $g = 0$. We have compared here the various methods discussed
in Sections 5.2 and 5.3, including some variants where one only uses first order time
differencing combined with second order upwinding for the space variables. We have
also tested the variant of scheme (5.12) where $\frac{Dp}{Dt}$ has been discretized by

$$\frac{Dp}{Dt}(x, (n+1)\Delta t) \approx \frac{1}{\Delta t}\Big[\frac{3}{2}p^{n+1}(x) - 2p^n(X(n\Delta t; x, (n+1)\Delta t)) + \frac{1}{2}p^{n-1}(X((n-1)\Delta t; x, (n+1)\Delta t))\Big].$$ (5.13)

The numerical experiments have been carried out for $N = 2, 3, 4, 5, 6$.


5.4.1.  TWO  DIMENSIONAL  EXPERIMENTS.

Data:  «  -  (0,  l)2  ,  f-  0  ,  g  -  0  ,  -  =  10'"  , 

V  =  1)  with  7  -  (x-xo)-+(y-y0T  ,  {x0  ,  y0}  =*  1-1,  -1}  , 
$$p_0(x,y) = 16^2\, x(\tfrac{1}{2} - x)\, y(\tfrac{1}{2} - y) \ \text{if } (x,y) \in (0,1/2)^2, \qquad p_0(x,y) = 0 \ \text{in } \Omega \setminus (0,1/2)^2.$$


The behavior of the numerical methods has been summarized in Table 5.1, below.

Method                                   h       Δt            CPU/time step (secs.), CRAY-XMP
2nd order upwinding with                 1/10    h/2           0.0016
1st order time differencing              1/20    h/2           0.0036
                                         1/40    h/2           0.0082
                                         1/80    h/2           0.021
2nd order upwinding and                  1/10    (illegible)   0.0017
time differencing                        1/20    (illegible)   0.0037
                                         1/40    (illegible)   0.0081
                                         1/80    (illegible)   0.020
Method of characteristics                1/10    h             0.0016
(1st order time differencing)            1/20    h             0.0029
                                         1/40    h             0.0074
                                         1/80    h             0.022
Method of characteristics                1/10    h             0.0021
(2nd order time differencing)            1/20    h             0.0057
                                         1/40    h             0.018
                                         1/80    h             0.065

Table 5.1 (Two dimensional experiments).

Figure 5.2 shows the trace, on the diagonal $y - x = 0$, of the computed solution at
$t = 1$ for various values of $h$ (1st order time differencing, second order upwinding);
Figure 5.3 shows the time evolution of this trace (computations done by the above
method with $h = 1/40$). Figure 5.4 corresponds to the same experiment as in
Figure 5.2 except that here we have been using second order time differencing and
upwinding; we observe that the results for $h = 1/40$ and $1/80$ are practically identical
and that those obtained with $h = 1/20$ are indeed very close to those obtained by
the previous method with $h = 1/40$. Figures 5.5 and 5.6 correspond to the same
experiment as in Figures 5.2 and 5.3, except that here one has used the method of
characteristics of Section 5.3 (first order time differencing in Figure 5.5, 2nd order
time differencing in Figure 5.6); we observe that the method of characteristics
used here (with bilinear interpolation on the finite difference cells) is more
dissipative than the 2nd order upwinding methods; we observe also that, coupled to
the method of characteristics, second order time differencing seems to be slightly
more dissipative than the first order one.


5.4.2.  THREE-DIMENSIONAL  EXPERIMENTS. 

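Data: $\Omega = (0,1)^3$, $f = 0$, $g = 0$, $\varepsilon = 10^{-3}$, $10^{-4}$ and $10^{-5}$, $V = \nabla(1/r)$ with

$r = \sqrt{(x-x_0)^2 + (y-y_0)^2 + (z-z_0)^2}$, $\{x_0, y_0, z_0\} = \{2, 2, 2\}$,

$$p_0(x,y,z) = 16^3\, xyz\,(1-x)(1-y)(1-z) \ \text{in } \omega = (0,1/2)^3, \qquad p_0(x,y,z) = 0 \ \text{in } \Omega \setminus \omega.$$
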
Table 5.2 summarizes some of the numerical results.

                                                 CPU/time step (secs), CRAY-XMP
Method                         h       Δt       ε = 10^-3    ε = 10^-4    ε = 10^-5
2nd order upwinding            1/10    h/2      0.012        0.012        0.012
and time differencing          1/20    h/2      0.059        0.057        0.061
                               1/40    h/2      0.28         0.28         0.29
Characteristics with 2nd       1/10    h/2      0.022        0.019        0.017
order time differencing        1/20    h/2      0.15         0.14         0.13
                               1/40    h/2      1.07         1.03         1.03

Table 5.2 (Three dimensional experiments)


On Figure 5.7 (resp. 5.8) we have shown, at $t = 4$, the trace, on the line $x = y = z$,
of the solution of (5.4) computed for various values of $h$ by the second order
upwinding and time differencing method (resp. the method of
characteristics with second order time differencing). We observe again that
the method of characteristics is more dissipative and less accurate than the
upwinding method, for which the results at $h = 1/40$ and $h = 1/80$ are
practically identical. For those readers who may be surprised by the fact that the
three dimensional results show more dissipation (for the same value of $\varepsilon$) than the
two dimensional ones, we would like to mention that a Fourier analysis would show
a faster time decay to zero, due to the fact that, for the same rank, the eigenvalues
of the Laplace operator are increasing functions of the dimension $N$ if one
considers as space domain $\Omega = (0,1)^N$.
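
A back-of-the-envelope version of this Fourier argument is sketched below. It assumes pure diffusion with homogeneous Dirichlet data (the advection term is ignored), for which the slowest mode on $(0,1)^N$ has eigenvalue $N\pi^2$ and decays like $\exp(-\varepsilon N \pi^2 t)$; the function name is illustrative.

```python
import math

def slowest_decay_factor(N, eps, t):
    """exp(-eps * lambda_1 * t), where lambda_1 = N * pi^2 is the smallest
    Dirichlet eigenvalue of -Laplace on the unit cube (0,1)^N."""
    return math.exp(-eps * N * math.pi**2 * t)

# amplitude left in the slowest mode at t = 4 for eps = 1e-3
factors = {N: slowest_decay_factor(N, 1e-3, 4.0) for N in (2, 3, 4)}
```

The factor decreases with $N$, consistent with the stronger damping seen in the three dimensional results.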
Figure 5.9 corresponds to the same experiment as Figure 5.7, except that
$\varepsilon = 10^{-4}$; Figure 5.10 shows the time evolution of the trace, on the line $x = y = z$,
of the solution of (5.4) computed by the second order upwinding and time
differencing method for $h = 1/40$, $\Delta t = 1/80$, $\varepsilon = 10^{-4}$. Figures 5.11 and 5.12
correspond to the same experiments as Figures 5.9 and 5.10, except that we have
used here the method of characteristics with second order time differencing.
Finally, Figures 5.13 to 5.16 correspond to the same experiments as Figures 5.9 to
5.12 except that now $\varepsilon = 10^{-5}$.

5.4.3.  FOUR  DIMENSIONAL  EXPERIMENTS. 

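Data: $\Omega = (0,1)^4$, $f = 0$, $g = 0$, $\varepsilon = 10^{-3}$, $V = \nabla(1/r^2)$ with

$r^2 = \sum_{i=1}^{4} (x_i - 2)^2$ if $x = \{x_i\}_{i=1}^{4}$,

and

$$p_0(x) = 16^4 \prod_{i=1}^{4} x_i(1/2 - x_i) \ \text{in } \omega = (0,1/2)^4, \qquad p_0(x) = 0 \ \text{in } \Omega \setminus \omega.$$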


Table 5.3 summarizes some of the numerical results.

Method                       h       Δt      CPU/time step, CRAY-XMP
2nd order upwinding          1/10    h/2     0.12 sec.
and time differencing        1/20    h/2     1.1 sec.
                             1/32    h/2     5.4 sec.

Table 5.3 (Four dimensional experiments)
Figure 5.17 shows the trace, at $t = 4$ and on the line $x_1 = x_2 = x_3 = x_4$, of the
solutions computed by second order upwinding and time differencing, for
the above values of $h$ and $\Delta t$; we observe the good agreement between the
solutions for $h = 1/20$ and $h = 1/32$. Figure 5.18 shows the time evolution on the
line $x_1 = x_2 = x_3 = x_4$ of the solution computed by the above method with $h = 1/32$
and $\Delta t = 1/64$.


5.4.4.  FIVE  DIMENSIONAL  EXPERIMENTS. 

Data;  W  -  (0,l)s  ,  f  -  0  ,  g  -  0  ,  e  -  10"3  ,  10'*  , 


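$V = \nabla(r^{-3})$, with

$r^2(x) = \sum_{i=1}^{5} (x_i - 2)^2$ if $x = \{x_i\}_{i=1}^{5}$,

$$p_0(x) = 16^5 \prod_{i=1}^{5} x_i(1/2 - x_i) \ \text{in } \omega = (0,1/2)^5, \qquad p_0(x) = 0 \ \text{in } \Omega \setminus \omega.$$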


Since we did not have access to the full memory of the CRAY-XMP we only
considered $h = 1/10$ and then $\Delta t = h/2 = 1/20$. Using the second order
upwinding and time differencing method the CPU/time step ratio was 0.9 sec. if
e  -  10‘6  . 

The results displayed on Figures 5.19, 5.20 (traces on the line $x_1 = x_2 = x_3 = x_4 = x_5$)
show that, as expected, the finite difference mesh is too coarse. Extrapolating
from the results in lower dimensions, the solution of the above test problem would
require at least $h = 1/32$.
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5.4.5.  SIX  DIMENSIONAL  EXPERIMENTS. 

Data:  f  -  0  ,  g  =  0  ,  c  -  10'9  ,  h  -  1  /10  ,  At  -  1  /20  ,  and 


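$V = 10\,\nabla(r^{-4})$ with

$r^2(x) = \sum_{i=1}^{6} (x_i - 2)^2$ if $x = \{x_i\}_{i=1}^{6}$,

$$p_0(x) = 16^6 \prod_{i=1}^{6} x_i(1/2 - x_i) \ \text{in } \omega = (0,1/2)^6, \qquad p_0(x) = 0 \ \text{in } \Omega \setminus \omega.$$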


This test case is definitely a limit one, at least with the class of computers that
the present authors can access; using a solid state device, the CPU/time step ratio
is here 173 seconds. It is clear that by a very sophisticated coding, and still using
the same method (second order upwinding and time differencing), the above
performances can be improved, but it is clearly the type of situation where
massively parallel computing is needed.


5.5.  FURTHER  COMMENTS. 

In the particular case where $\Omega = (0,1)^N$, dimensional splitting methods can
be used to increase the degree of parallelism of the problem under consideration.
We observe, for example, that if $N = 4$, then


i'i-  i  B  *  r  p  . 

;  -  1  3\  .  i  '?  UX: 


which  suggests  to  apply  the  general  methods  of  Section  2  with 



It  is  our  intention  to  start  a  campaign  of  numerical  experiments  to  test  the 


validity  of  this  approach. 


6.  CONCLUSION. 

Operator splitting methods definitely provide efficient methods for the numerical
solution of mathematical problems modeled by parabolic equations or which can be
reduced  to  the  solution  of  such  problems.  In  the  case  of  problems  in  very  high 
dimension  (N>4)  the  validity  of  this  approach  needs  to  be  tested  through  further 
numerical  experiments.  An  important  conclusion  which  appears  already  is  that  the 
evolution  aspect  of  these  problems  makes  relaxation  techniques  very  valuable 
solution  methods  if  one  uses  implicit  schemes  for  the  time  discretization. 
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Algorithms  for  Rational  Spline  Curves 


Klaus  Hollig1 
ABSTRACT. The theory of univariate splines is well understood. However, applying
the standard techniques for spline functions does not make use of some important
features of piecewise polynomial curves. Since curves are invariant under
reparametrization, the smoothness conditions for splines are less restrictive and standard
approximation methods can be improved. This note discusses rational cubic spline curves
and describes in particular two basic algorithms: the construction of smooth splines from
control points and Hermite interpolation.


1.  Introduction 

We first review briefly two basic algorithms for "standard" cubic spline curves as a
preparation for the generalizations to be discussed in the following sections. These
algorithms are best described using the Bezier form for polynomials, which allows a
particularly simple characterization of smoothness constraints for splines. The Bezier
coefficients $b_\nu$ of a cubic polynomial $p$ are defined by

$$p(t) = \sum_{\nu=0}^{3} b_\nu B_\nu(t), \quad 0 \le t \le 1,$$

where $B_\nu(t) := \binom{3}{\nu} t^\nu (1-t)^{3-\nu}$ are the Bernstein polynomials. Therefore, as is
illustrated in Figure 1, a piecewise cubic spline curve $p$ can be represented by a
sequence of Bezier coefficients

$$b_\nu^j, \quad \nu = 0,\dots,3, \quad j = 0,\dots,J.$$

It is assumed that $b_3^{j-1} = b_0^j$, i.e. that the curve segments join continuously. Continuity
of the first and second derivatives of the parametrization is equivalent to the conditions

$$b_1^+ - b_0^+ = b_3^- - b_2^-,$$
$$(b_2^+ - b_1^+) - (b_1^+ - b_0^+) = (b_3^- - b_2^-) - (b_2^- - b_1^-),$$

where $b^\pm$ denote the Bezier coefficients of two adjacent curve segments. The particular form
of these conditions yields a very simple algorithm for constructing the Bezier coefficients
of twice continuously differentiable spline parametrizations from control points (cf. Figure
2).

1  supported  by  the  United  States  Army  under  Contract  No.  DAAG29-80-C-0041  and 
sponsored  by  the  National  Science  Foundation  under  Grant  No.  DMS-8351187 
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Figure  1.  Bezier  polygon  of  a  cubic  spline  curve 

Algorithm 1. ($c \Rightarrow b$) The Bezier coefficients $b$ corresponding to a sequence of
control points $c$ are given by

$$b_1^j := (2c^{j+1} + c^{j+2})/3, \qquad b_2^{j-1} := (2c^{j+1} + c^j)/3,$$
$$b_3^{j-1} = b_0^j := (b_2^{j-1} + b_1^j)/2.$$
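
A minimal sketch of Algorithm 1 follows. It assumes the control points are stored as rows of an array `c` (so that $m$ control points give $m - 3$ cubic segments); the function name and array layout are illustrative choices, not part of the original note.

```python
import numpy as np

def bezier_from_control_points(c):
    """Algorithm 1: Bezier coefficients, shape (segments, 4, dim), from control points."""
    c = np.asarray(c, dtype=float)
    left = (c[:-2] + 2.0 * c[1:-1]) / 3.0    # left[j]  = b_2^{j-1}
    right = (2.0 * c[1:-1] + c[2:]) / 3.0    # right[j] = b_1^{j}
    knots = (left + right) / 2.0             # knots[j] = b_3^{j-1} = b_0^{j}
    return np.stack([knots[:-1], right[:-1], left[1:], knots[1:]], axis=1)

rng = np.random.default_rng(2)
c = rng.normal(size=(7, 2))                  # hypothetical planar control points
b = bezier_from_control_points(c)
```

The first and second difference conditions stated before Algorithm 1 can then be checked directly on consecutive rows of `b`.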

Combining the steps in Algorithm 1, $f^j := b_3^{j-1} = b_0^j$ can be expressed in terms of the
control points,

$$f^j = (c^j + 4c^{j+1} + c^{j+2})/6$$ (1)

which yields


Algorithm 2. ($f \Rightarrow c$) The control points $c$ of the natural spline interpolant
(which has zero curvature at the endpoints) corresponding to the data $f^j$, $j = 0,\dots,J$,
are computed by solving the linear system (1) for $j = 1,\dots,J-1$ with the end conditions

$$c^1 = f^0, \qquad c^0 = 2c^1 - c^2,$$
$$c^{J+1} = f^J, \qquad c^{J+2} = 2c^{J+1} - c^J.$$

Figure 3 shows an example which illustrates a slight disadvantage of the method: possible
oscillations near inflection points.
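
Algorithm 2 amounts to a tridiagonal solve, which the following sketch carries out with a dense solver for simplicity. It assumes the data points are rows of a two-dimensional array and that the end conditions include $c^{J+1} = f^J$, the mirror image of $c^1 = f^0$; the function name is hypothetical.

```python
import numpy as np

def natural_spline_control_points(f):
    """Algorithm 2: control points c^0..c^{J+2} from data f^0..f^J (rows of f)."""
    f = np.asarray(f, dtype=float)
    J = f.shape[0] - 1
    c = np.zeros((J + 3, f.shape[1]))
    c[1], c[J + 1] = f[0], f[J]
    if J >= 2:
        # unknowns c^2..c^J satisfy c^j + 4 c^{j+1} + c^{j+2} = 6 f^j, j = 1..J-1
        A = 4.0 * np.eye(J - 1) + np.eye(J - 1, k=1) + np.eye(J - 1, k=-1)
        rhs = 6.0 * f[1:J]
        rhs[0] -= c[1]
        rhs[-1] -= c[J + 1]
        c[2:J + 1] = np.linalg.solve(A, rhs)
    # natural end conditions: vanishing second differences at both ends
    c[0] = 2.0 * c[1] - c[2]
    c[J + 2] = 2.0 * c[J + 1] - c[J]
    return c

rng = np.random.default_rng(3)
f = rng.normal(size=(6, 2))                  # hypothetical data points
c = natural_spline_control_points(f)
knots = (c[:-2] + 4.0 * c[1:-1] + c[2:]) / 6.0
```

Because every data point enters the linear system, changing a single $f^j$ modifies all control points; this is the global influence that the local method discussed later in the note avoids.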
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Figure  3.  Natural  spline  interpolant 

A  piecewise  rational  curve  is  represented  by  a  sequence  of  coefficients  and  weights, 

^  =  0, . . . ,  3,  j  = 

where, for continuity, $b_3^{j-1} = b_0^j$. To characterize higher order smoothness, we recall the
definition of

Smoothness for Curves. Smoothness of a curve, $t \to f(t) \in \mathbb{R}^3$, is characterized
in terms of differentiability with respect to arclength $s := \int |f'|$. Since $dt/ds = 1/|f'|$,
where $|\ |$ denotes the length of a vector, the first and second derivatives of $f$ with respect
to arclength are given by

$$\frac{df}{ds} = \frac{f'}{|f'|}, \qquad \frac{d^2 f}{ds^2} = \frac{|f'|^2 f'' - (f' \cdot f'') f'}{|f'|^4}.$$

Taking the cross product of the second equation with $\xi$, this means that $C^2$-continuity
is equivalent to continuity of the vectors

$$\xi_f := f'/|f'|, \qquad (\kappa\eta)_f := f' \times f''/|f'|^3.$$

The vector $\xi$ is the unit tangent vector, $\kappa$ is the curvature and $\eta$ is the binormal vector
which is a unit vector orthogonal to the osculating plane.
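
The invariance of these two vectors under reparametrization can be checked numerically. The sketch below uses a circle of radius $R$ and the reparametrization $t = u^2$ as a hypothetical example; the function name is illustrative.

```python
import numpy as np

def tangent_and_curvature_vector(d1, d2):
    """Return xi = f'/|f'| and kappa*eta = f' x f''/|f'|^3 from f' and f''."""
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    speed = np.linalg.norm(d1)
    return d1 / speed, np.cross(d1, d2) / speed**3

R, t = 2.0, 0.7
# circle f(t) = (R cos t, R sin t, 0)
fp = R * np.array([-np.sin(t), np.cos(t), 0.0])
fpp = R * np.array([-np.cos(t), -np.sin(t), 0.0])
xi_f, kn_f = tangent_and_curvature_vector(fp, fpp)

# same point on the reparametrized curve g(u) = f(u^2), with u = sqrt(t)
u = np.sqrt(t)
gp = 2.0 * u * fp                      # chain rule: g' = 2u f'
gpp = 2.0 * fp + 4.0 * u**2 * fpp      # g'' = 2 f' + 4u^2 f''
xi_g, kn_g = tangent_and_curvature_vector(gp, gpp)
```

Although $g'$ and $g''$ differ from $f'$ and $f''$, both parametrizations give the same unit tangent and the same curvature vector of length $1/R$.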
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Figure  2.  Control  polygon,  Bezier  coefficients  and  corresponding  spline 
curve 


While straightforward, the above algorithms do not make use of the additional
flexibility due to the weaker smoothness constraints for curves. This observation has led to
the development of $\beta$-splines and interesting new approximation and design techniques (cf.
[BBB80], [B685]). So far, the new geometric ideas have been primarily applied to
polynomial splines. We discuss in this note the generalization of the basic algorithms to rational
cubic splines. First, we describe the rational Bezier form in Section 2. Then, in Section 3
we discuss the analogues of Algorithms 1 and 2. Section 4 lists MACSYMA computations
which establish the main result of this note.


2.  Rational  Bezier  form 


We review the definition and some basic facts about the Bezier form and refer to
[FP79] for details. The Bezier form of a rational cubic parametrization $r$ is defined as

$$r(t) = \frac{p(t)}{q(t)} = \frac{\sum_{\nu=0}^{3} b_\nu w_\nu B_\nu(t)}{\sum_{\nu=0}^{3} w_\nu B_\nu(t)}, \quad 0 \le t \le 1,$$ (2)

where the coefficients $b_\nu$ are vectors in $\mathbb{R}^3$ and the weights $w_\nu$ are positive numbers. The
homogeneous form, i.e. multiplication with the weights in the numerator, simplifies the
algebra and geometric interpretation. As in the polynomial case, the control polygon is
tangent to the curve at the end points. The weights control the influence of the
corresponding coefficients, i.e. increasing $w_\nu$ "pulls" the curve towards the coefficient $b_\nu$ as is
illustrated in Figure 4.
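
The effect of the weights can be seen on a small example. The sketch below evaluates the rational Bezier form (2) for a single parameter value; the coefficients, weights and function name are hypothetical choices made for illustration.

```python
import numpy as np
from math import comb

def rational_bezier(b, w, t):
    """Evaluate the rational cubic (2) at a scalar parameter t in [0, 1]."""
    b = np.asarray(b, dtype=float)       # shape (4, dim): Bezier coefficients
    w = np.asarray(w, dtype=float)       # shape (4,): positive weights
    B = np.array([comb(3, v) * t**v * (1.0 - t)**(3 - v) for v in range(4)])
    return (w * B) @ b / np.dot(w, B)

b = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 0.0], [3.0, 2.0, 1.0], [4.0, 0.0, 1.0]])
t = 1.0 / 3.0
r_plain = rational_bezier(b, [1.0, 1.0, 1.0, 1.0], t)    # polynomial case
r_pulled = rational_bezier(b, [1.0, 5.0, 1.0, 1.0], t)   # larger weight on b_1
dist_plain = np.linalg.norm(r_plain - b[1])
dist_pulled = np.linalg.norm(r_pulled - b[1])
```

With all weights equal the denominator is identically one and the polynomial Bezier form is recovered; raising $w_1$ moves the evaluated point along the segment towards $b_1$.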
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Figure  4.  Rational  Bezier  form 


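Computing the vectors $\xi$ and $\kappa\eta$ for the parametrization (2) at the endpoints gives

$$\xi_r(0) = \frac{b_1 - b_0}{|b_1 - b_0|}, \qquad (\kappa\eta)_r(0) = \frac{2}{3}\,\frac{w_0 w_2}{w_1^2}\,\frac{(b_1 - b_0) \times (b_2 - b_1)}{|b_1 - b_0|^3},$$

$$\xi_r(1) = \frac{b_3 - b_2}{|b_3 - b_2|}, \qquad (\kappa\eta)_r(1) = \frac{2}{3}\,\frac{w_1 w_3}{w_2^2}\,\frac{(b_2 - b_1) \times (b_3 - b_2)}{|b_3 - b_2|^3}.$$ (3)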

Therefore, two adjacent curve segments with Bezier coefficients $b^\pm$ and weights $w^\pm$ join
twice continuously differentiably at $b_3^- = b_0^+$ if the following two conditions are satisfied:

(C1) $b_2^-$, $b_3^- = b_0^+$, $b_1^+$ are collinear;

(C2) $b_1^-$, $b_2^-$, $b_3^- = b_0^+$, $b_1^+$, $b_2^+$ lie in a half plane and the parallelograms $R_\pm$ in Figure 5
satisfy

$$\frac{w_1^- w_3^-}{(w_2^-)^2}\,\frac{\mathrm{area}(R_-)}{|b_3^- - b_2^-|^3} = \frac{w_0^+ w_2^+}{(w_1^+)^2}\,\frac{\mathrm{area}(R_+)}{|b_1^+ - b_0^+|^3}.$$ (4)


3.  Control  points  and  interpolation 


The  geometric  description  of  the  smoothness  conditions  easily  yields  the  analogue  of 
Algorithm  1  which  is  a  variant  of  the  corresponding  method  in  the  polynomial  case  [B685]. 
Denote by $\beta_\pm$ and $\delta_\pm$ the relative lengths of adjacent line segments as indicated in Figure
5. For example,

b7  —  60  :  63  -  6^'  =  :  6_, 



Figure  5.  Geometric  smoothness  constraints 


Then,  since  area(iZ+)  :  area(i?_)  =  (£+/?_)  :  (£_/?+),  condition  (4)  becomes 


S2  :=  0 6+/6. )2  =  l/3_/0+) 


WqV)£  (u>2  )2 


«) 


2  in. 


wl  w3 


(5) 


This means that one can select the points $b_1^-$, $b_2^-$, $b_1^+$, $b_2^+$ and the weights $w_\nu^\pm$ essentially
arbitrarily, and it is then possible to select $\delta$, and hence $b_3^- = b_0^+$, so that the smoothness
conditions (C) are satisfied. This yields the following algorithm.


Algorithm I. ($c, w, \beta \Rightarrow b$) The Bezier coefficients of a piecewise rational spline
curve corresponding to the sequence of control points $c^j$, weights $w_\nu^j$ and parameters $\beta_\pm^j > 0$
are given by

b{  :=  ((1  +  0l+2)c 3  +  1  -f  0i+1c3+2)/{l  +  0l+1  -  33S2) 
b 2~l  :=  ((1  -  0l)c3*1  -  3l+1)c3)/{  1  +  0 3_+l  -  3i) 

H'1  =  6£  :=  (^“16J2_1  -  6i)/(l  - 

where  6'7""1  is  defined  in  terms  of  ir;.  according  to  (5). 

Algorithm 1 is a special case corresponding to $w_\nu^j = \beta_\pm^j = 1$. The weights $w$ and the
parameters $\beta$ permit local control of the "shape" of the curve while keeping the control
points fixed. This is illustrated in Figure 6. Decreasing $\beta$ increases the curvature at
the knots and the curve approaches the "control polygon" which connects the points $c^j$.
Increasing a particular weight stresses the influence of the corresponding Bezier coefficient.
If this additional flexibility is not needed, the parameters can be set according to suitable
optimality criteria.


Figure 6. Control points and corresponding rational spline curve for $\beta$ = 1/4, 1, 4


Algorithm 2 for interpolation requires the solution of a linear system, i.e. changes in
the data have a global influence. Using the additional degrees of freedom due to the weaker
smoothness constraints, it is possible to construct smooth interpolants by a local method.
This method is suggested by the expressions (3) for $\xi$ and $\eta$. Setting $\delta_i := |b_{2i+1} - b_{2i}|$,
$f^i := r(i)$, $\xi^i := \xi_r(i)$, $(\kappa\eta)^i := (\kappa\eta)_r(i)$ for $i = 0, 1$ and substituting

$$b_2 - b_1 = (f^1 - f^0) - \delta_0 \xi^0 - \delta_1 \xi^1,$$

the equations for $\kappa\eta$ in (3) can be rewritten as

$$(\kappa\eta)^i = (-1)^i \rho_i\, \xi^i \times (f^1 - f^0) + \sigma_i\, \xi^1 \times \xi^0, \quad i = 0, 1, \tag{E}$$

where

$$\rho_i = \frac{2}{3}\,\frac{w_{3i}\, w_{2-i}}{w_{1+i}^2\, \delta_i^2}, \qquad \sigma_i = \rho_i\, \delta_{1-i}. \tag{6}$$


Since both sides of the $i$-th equation are orthogonal to $\xi^i$, the equations (E) are equivalent
to a 4 x 4 linear system for $\rho$ and $\sigma$. This system has a solution with $\rho, \sigma > 0$ if

(A) $\eta^i$ lies in the interior of the cone spanned by

$(-1)^i\, \xi^i \times (f^1 - f^0)$ and $\xi^1 \times \xi^0$, $i = 0, 1$.

Choosing $w_1 := w_2 := 1$, the remaining weights $w_0, w_3$ and the parameters $\delta$ can be
expressed in terms of $\rho$ and $\sigma$:

$$\delta_{1-i} = \sigma_i / \rho_i, \qquad w_{3i} = (3/2)\, \delta_i^2\, \rho_i. \tag{7}$$


The corresponding method described in Algorithm II is a generalization of Hermite
interpolation.


Figure  7.  Rational  spline  interpolant  of  a  helix 


Algorithm II. ($f, \xi, \kappa\eta \Rightarrow b, w$) The Bezier coefficients $b_\nu$ and weights $w_0, w_3$ of
the $j$-th segment of a piecewise cubic rational spline $r_f$ which matches the unit tangent
vectors $\xi^j$ and the vectors $(\kappa\eta)^j$ at the points $f^j$ can be determined by solving the system
(E) with

$$f^i := f^{j+i}, \quad \xi^i := \xi^{j+i}, \quad (\kappa\eta)^i := (\kappa\eta)^{j+i}, \quad i = 0, 1,$$

provided that condition (A) holds.
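The local construction can be sketched in a few lines. This is a hedged illustration, not the authors' program: it assumes the system (E) in the form $(\kappa\eta)^i = (-1)^i \rho_i\, \xi^i \times (f^1 - f^0) + \sigma_i\, \xi^1 \times \xi^0$, the relations (7) with $w_1 = w_2 = 1$, and the endpoint formulas (3). All function names and the helix test data are assumptions made for the example.

```python
import math


def sub(a, b): return [x - y for x, y in zip(a, b)]
def add(a, b): return [x + y for x, y in zip(a, b)]
def scale(c, a): return [c * x for x in a]
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]


def local_segment(f, xi, keta):
    """Solve (E) for one segment; f, xi, keta are pairs indexed by i = 0, 1."""
    d = sub(f[1], f[0])
    c = cross(xi[1], xi[0])
    rho, sigma = [0.0, 0.0], [0.0, 0.0]
    for i in (0, 1):
        a = scale((-1) ** i, cross(xi[i], d))
        # The dot product with xi[1-i] eliminates sigma_i, since xi[1-i].(xi1 x xi0) = 0.
        rho[i] = dot(keta[i], xi[1 - i]) / dot(a, xi[1 - i])
        # The dot product with the binormal direction keta[i] then gives sigma_i.
        sigma[i] = (dot(keta[i], keta[i]) - rho[i] * dot(a, keta[i])) / dot(c, keta[i])
    delta = [sigma[1] / rho[1], sigma[0] / rho[0]]      # delta_{1-i} = sigma_i / rho_i
    w = [1.5 * delta[0] ** 2 * rho[0], 1.0, 1.0, 1.5 * delta[1] ** 2 * rho[1]]
    b = [f[0], add(f[0], scale(delta[0], xi[0])),
         sub(f[1], scale(delta[1], xi[1])), f[1]]
    return b, w, rho, sigma


def endpoint_curvature_vectors(b, w):
    """kappa*eta at t = 0 and t = 1 from the closed-form endpoint formulas."""
    e0, e1, e2 = sub(b[1], b[0]), sub(b[2], b[1]), sub(b[3], b[2])
    n0, n2 = math.sqrt(dot(e0, e0)), math.sqrt(dot(e2, e2))
    k0 = scale((2 / 3) * w[0] * w[2] / w[1] ** 2 / n0 ** 3, cross(e0, e1))
    k1 = scale((2 / 3) * w[1] * w[3] / w[2] ** 2 / n2 ** 3, cross(e1, e2))
    return k0, k1


def helix_data(theta, c=0.5):
    """Point, unit tangent and kappa*binormal of the helix (cos, sin, c*theta)."""
    n = math.sqrt(1 + c * c)
    f = [math.cos(theta), math.sin(theta), c * theta]
    xi = [-math.sin(theta) / n, math.cos(theta) / n, c / n]
    eta = [c * math.sin(theta) / n, -c * math.cos(theta) / n, 1 / n]
    return f, xi, scale(1 / (1 + c * c), eta)


f0, xi0, k0 = helix_data(0.0)
f1, xi1, k1 = helix_data(0.5)
b, w, rho, sigma = local_segment([f0, f1], [xi0, xi1], [k0, k1])
```

For this helix segment the computed $\rho_i$ and $\sigma_i$ are positive, the end weights stay close to 1, and the endpoint formulas reproduce the prescribed vectors $(\kappa\eta)^i$.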

The following Theorem shows that, for data corresponding to a smooth curve,
condition (A) is satisfied if the interpolation points $f^j$ are sufficiently close. Moreover, the
interpolant is of high order accuracy and has good shape preserving properties for smooth
data. This is illustrated in Figure 7. If condition (A) is not valid for a particular curve
segment, then the given curvatures and binormals cannot be interpolated and have to be
modified or possibly chosen differently for adjacent segments. An interesting open
problem is whether for given points $f^j$ the vectors $\xi$ and $\kappa\eta$ can be chosen so that (A) is valid
and the resulting scheme remains accurate and shape preserving.

Theorem. Assume that the data $f^j$, $\xi^j$, $(\kappa\eta)^j$ in Algorithm II correspond to a smooth
curve $f$ with nonvanishing curvature $\kappa$ and torsion $\tau$ (cf. [FP79, p. 102] for definitions).
If the distance $h := \max_j |f^j - f^{j-1}|$ between adjacent points is sufficiently small, then
for each pair of adjacent points, condition (A) is satisfied and hence the system (E) has a
unique solution with $\rho, \sigma > 0$. Moreover, the corresponding piecewise rational interpolant
$r_f$ is 6-th order accurate, i.e.


$$\mathrm{dist}(f, r_f) = O(h^6).$$


For planar curves, a similar result was obtained in [BHS87] for interpolation with
piecewise cubic polynomials. For the rational case, the proof is somewhat simpler, since
the weights $w$ and the parameters $\delta$ can be expressed explicitly in terms of the data.

Proof. Consider a typical curve segment of the interpolant, e.g. corresponding to
the end points $f^0$, $f^1$. Without loss of generality we assume that $f$ is parametrized with respect to
arclength $s$ and that

$$f^0 = [\,0\ \ 0\ \ 0\,], \quad \xi^0 = [\,1\ \ 0\ \ 0\,], \quad \eta^0 = [\,0\ \ 0\ \ 1\,], \tag{8}$$

and $f^1 := f(s)$. Denote by $r(t,s) = p(t,s)/q(t,s)$, $0 \le t \le 1$, the rational interpolant. We
will show that

(i) for sufficiently small $s$, the system (E) has a unique solution with $\rho, \sigma > 0$;

(ii) $q(t,0) = 1$;

(iii) $\left(\partial_t r_1(t,s)/s\right)\big|_{s=0} = 1$;

(iv) $\partial_t^\ell\, [p(t,s),\, q(t,s)] = O(s^\ell)$, $\ell = 1, 2, 3$.

Assertions (i) and (ii) guarantee that $r$ is well defined for small $s$, i.e. as the distance of
the points $f^0$ and $f^1$ becomes small. Assertion (iii) implies that the derivative of the first
coordinate $x = r_1$ of $r$ satisfies

$$c_0 s \le \partial_t r_1(t,s) \le c_1 s \tag{9}$$

for some positive constants $c_\nu$ and $s$ sufficiently small. In particular, the function $r_1$ is monotone
increasing in $t$. With $t_1$ denoting its inverse, i.e. $x = r_1(t_1(x,s), s)$, the rational interpolant
has the equivalent parametrization

$$x \to R(x,s) := [\,x \quad r_2(t_1(x,s),s) \quad r_3(t_1(x,s),s)\,].$$

Similarly, $f$ can be parametrized with respect to the first coordinate,

$$x \to F(x).$$

Since the interpolation conditions are invariant under reparametrization, the unit tangent,
curvature and binormal of $R$ and $F$ match at $x_0 := 0$ and $x_1 := r_1(1,s) = f_1(s)$. Using
that $R_1(x) = F_1(x) = x$, this implies that the derivatives of $R$ and $F$ match at these
points up to second order. From the standard error estimate for interpolation of functions
it follows that

$$|R(x,s) - F(x)| = O((x_1 - x_0)^6) = O(s^6)$$

provided that the derivatives of $R$ with respect to $x$ up to order 6 are bounded, uniformly
in $s$. This follows from (iv). To see this we note that

$$\partial_x R_\nu = r_\nu^{(1)} / r_1^{(1)}$$

and compute the derivatives of $R_\nu$ inductively using the chain rule. This shows that
$\partial_x^j R_\nu$ is a sum of terms of the form

$$r_\nu^{(k)}\, r_1^{(\ell_1)} \cdots r_1^{(\ell_m)} \,/\, \big(r_1^{(1)}\big)^{k + \ell_1 + \cdots + \ell_m}$$

where superscripts denote differentiation with respect to $t$ and all functions are evaluated
at $(t,s)$ with $t = t_1(x,s)$. Since $r_\nu^{(j)}$ is a sum of terms of the form

$$p_\nu^{(k)}\, q^{(\ell_1)} \cdots q^{(\ell_m)} / q^{m+1}, \quad j = k + \ell_1 + \cdots + \ell_m,$$

the boundedness of the derivatives of $R_\nu$ is a consequence of (ii), (iv) and (9).


It  remains  to  verify  assertions  (i)-(iv).  This  requires  elaborate  Taylor  expansions 
which  are  done  via  MACSYMA  as  is  described  in  the  final  section. 


4.  MACSYMA  computations 

Below we list a MACSYMA program for proving (i)-(iv) of the previous section. The
computation is divided into four main steps: Taylor expansion of the data; solution of
equations (E); Bezier form of r; verification of (i)-(iv). To speed up the computations,
we use Taylor expansion to simplify intermediate results. The order of truncation will be
justified at the end of this section.

Auxiliary functions. The following auxiliary functions will be used in the program:
(c1) is de Casteljau's algorithm for evaluating a polynomial at t from its Bezier coefficients
b; (c2) is the vector product; (c3) generates the first n + 1 terms of a power series with
coefficients $a_i$; (c4) computes the Taylor expansion of the solution of the differential equation
x'(t) = f(x(t)), x(0) = x0.


(c1)  bezier_form(b,n,t) :=
        if n=0 then row(b,1)
        else t*bezier_form(submatrix(1,b),n-1,t)
             + (1-t)*bezier_form(submatrix(n+1,b),n-1,t)$

(c2)  cross_product(a,b) :=
        [a[2]*b[3] - a[3]*b[2], a[3]*b[1] - a[1]*b[3], a[1]*b[2] - a[2]*b[1]]$

(c3)  power_series(a,n,t) :=
        if n=0 then a[0]
        else power_series(a,n-1,t) + a[n]*t^n/factorial(n)$

(c4)  solve_ode(f,x0,n,t) :=
        if n=0 then x0
        else (x1: solve_ode(f,x0,n-1,t),
              x1 + subst(0,t,diff(f(x1),t,n-1))*t^n/factorial(n))$
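For readers without access to MACSYMA, the recursion in (c1) can be mirrored in Python. This is a sketch only: the list-of-points representation and the comparison against the Bernstein form are choices made for the example, not part of the original program.

```python
from math import comb


def bezier_form(b, t):
    """Recursive de Casteljau evaluation; b is a list of n+1 Bezier coefficients."""
    if len(b) == 1:
        return list(b[0])
    upper = bezier_form(b[1:], t)    # first coefficient dropped, weighted by t
    lower = bezier_form(b[:-1], t)   # last coefficient dropped, weighted by 1 - t
    return [t * u + (1 - t) * l for u, l in zip(upper, lower)]


def bernstein_form(b, t):
    """Direct evaluation of the same polynomial from the Bernstein basis."""
    n = len(b) - 1
    return [sum(comb(n, k) * t ** k * (1 - t) ** (n - k) * b[k][d]
                for k in range(n + 1))
            for d in range(len(b[0]))]


# Illustrative cubic (an assumption, not data from the paper).
coeffs = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (3.0, 2.0, 1.0), (4.0, 0.0, 2.0)]
```

Both evaluations agree up to rounding, which is a convenient check that the two sub-lists are paired with the weights t and 1 - t in the same way as the two submatrix calls in (c1).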

Taylor expansion of the data. The data at the left endpoint are given by (8). To
obtain Taylor expansions for $f^1 = f(s)$, $\xi^1 = \xi(s)$ and $(\kappa\eta)^1 = \kappa(s)\eta(s)$, (c10)
approximately solves the Frenet differential equations [FP79, p. 103],

$$f' = \xi, \qquad \xi' = \kappa\zeta, \qquad \zeta' = \tau\eta - \kappa\xi, \qquad \eta' = -\tau\zeta,$$

where $\zeta$ with $\zeta(0) := [\,0\ \ 1\ \ 0\,]$ is the normal vector. This yields Taylor expansions for the
data in terms of the Taylor coefficients $u_i$ and $v_i$ of curvature and torsion (cf. (c6), (c7)).


(c5)  (kappa(s) := power_series(u,5,s), tau(s) := power_series(v,5,s))$

(c6)  kappa(s);

(d6)  u5 s^5/120 + u4 s^4/24 + u3 s^3/6 + u2 s^2/2 + u1 s + u0

(c7)  tau(s);

(d7)  v5 s^5/120 + v4 s^4/24 + v3 s^3/6 + v2 s^2/2 + v1 s + v0

(c8)  (f[0]: [0,0,0], xi[0]: [1,0,0], zeta[0]: [0,1,0], eta[0]: [0,0,1])$

(c9)  g(a) := [a[2], kappa(s)*a[3], tau(s)*a[4]-kappa(s)*a[2], -tau(s)*a[3]]$

(c10) ffs: solve_ode(g,[f[0], xi[0], zeta[0], eta[0]],6,s)$

(c11) (f[1]: ffs[1], xi[1]: ffs[2], zeta[1]: ffs[3], eta[1]: ffs[4])$

(c12) (kappa[0]: kappa(0), kappa[1]: kappa(s))$

Solution of equations (E). All vectors in the $i$-th equation in (E) are orthogonal
to $\xi^i$ and therefore the $i$-th equation is equivalent to the 2 x 2 system

$$\mathrm{eqn}^i_1 = \mathrm{eqn}^i_2\, \rho_i + \mathrm{eqn}^i_3\, \sigma_i, \quad i = 0, 1, \tag{10}$$

obtained in (c13) by forming the dot product of the $i$-th equation with the vectors $\eta^i$ and
$\xi^{1-i}$. (c14) solves the system (10) by backward substitution, using that $\mathrm{eqn}^i_{2,3} = 0$. The
parameters $\delta$ and weights $w$ are computed in (c15) and (c18) according to (7).
(c13) for i:0 thru 1 do
        eqn[i]: (matrix1: matrix(eta[i],xi[1-i]),
                 matrix2: matrix(kappa[i]*eta[i],
                                 (-1)^i*cross_product(xi[i],f[1]-f[0]),
                                 cross_product(xi[1],xi[0])),
                 ratsimp(taylor(matrix1.transpose(matrix2),s,0,6)))$

(c14) for i:0 thru 1 do
        (rho[i]: eqn[i][2,1]/eqn[i][2,2],
         sigma[i]: (eqn[i][1,1]-eqn[i][1,2]*rho[i])/eqn[i][1,3])$

(c15) for i:0 thru 1 do
        delta[1-i]: taylor(ratsimp(sigma[i]/rho[i]),s,0,3)$

(c16) delta[0];

(dl6)  s/3  -  (u0ui  +  2vcui)s2/(36u0uo)  +  (9u2uot>2 

-i-12uoUqU2  -  10u2vf  +  2uot>ouit;1 

-10r2u2  -+-  6u2fo  +  6uoVo)s3/(540u2Vo) 

(c17) delta[1]-delta[0];

(dl7)  -(u0vi  +  2v0u1)s2/(l8u0t;o)  -  (u£u0*>2  +  2u0t'2«2 

-u^fj  -  2vqu\)sz j (36u2i;q) 

(c18) for i:0 thru 1 do
        w[i]: taylor(ratsimp((3/2)*rho[i]*delta[i]^2),s,0,2)$

(c19) w[0];

(dl9)  1  ■+■  (24u2uoW2  12uo^ou2  —  35u2r2  -  8uot’oUjVi 

-20u2u2  +  36u2t’o  *  36ugt;2)s2/(720u5v2) 

(c20) w[1]-w[0];

(d20)  0 

Bezier form of r. Statements (c21) and (c22) define the polynomials $p(\cdot,s)$ and
$q(\cdot,s)$ using de Casteljau's algorithm in terms of the Bezier coefficients.


(c21) p: (wb: matrix(w[0]*f[0], f[0]+delta[0]*xi[0],
                     f[1]-delta[1]*xi[1], w[1]*f[1]),
          ratsimp(taylor(bezier_form(wb,3,t)[1],s,0,2)))$

(c22) q: (w: matrix([w[0]], [1], [1], [w[1]]),
          ratsimp(taylor(bezier_form(w,3,t)[1,1],s,0,2)))$


Verification of (i)-(iv). As is shown by (c23)-(c26), the dominant part of the
system (10), as $s \to 0$, is given by

$$\begin{pmatrix} u_0 \\ u_0^2 v_0 s^2/2 \end{pmatrix} = \begin{pmatrix} u_0 s^2/2 & -u_0 s \\ u_0^2 v_0 s^4/12 & 0 \end{pmatrix} \begin{pmatrix} \rho_i \\ \sigma_i \end{pmatrix}, \tag{11}$$

which proves (i). Clearly, (c27) proves (ii). (c28) computes the numerator of $\partial_t r_1(t,s)/s$,
where $r_1 = p_1/q$, and evaluates it at $s = 0$. In conjunction with (ii) this establishes (iii).
Assertion (iv) is equivalent to the statement that

$$\partial_t^\ell\, \partial_s^j\, [p, q](t, 0) = 0, \quad j < \ell \le 3.$$

This is checked by (c29) which displays $(\partial_t^\ell [p(t,s), q(t,s)])/s^{\ell-1}$ evaluated at $s = 0$.


(c23) taylor(eqn[0][1],s,0,2);

(d23) [u0, u0 s^2/2, -u0 s - u1 s^2/2]


(c24) taylor(eqn[1][1]-eqn[0][1],s,0,2);
(d24) [u1 s + u2 s^2/2, 0, 0]


(c25) taylor(eqn[0][2],s,0,4);

(d25)  [v0u2s2/2  -r  (vju^ 2u,u0u0)s3/6  -  (u0«o  +  (wo  “ 

-(3u2r0  +  3u1u1)u0)s4/24,  v0u2s4/12,  Oj 


(c26) taylor(eqn[1][2]-eqn[0][2],s,0,4);

(d26) [(u0^2 v1 + 2 u0 v0 u1) s^3/6 + (u0^2 v2 + 2 u0 v0 u2
        + 4 u0 u1 v1 + 2 v0 u1^2) s^4/12, 0, 0]


(c27) subst(0,s,q);
(d27) 1


(c28) subst(0,s,ratsimp((diff(p[1],t)*q-p[1]*diff(q,t))/s));

(d28) 1

(c29) for i:1 thru 3 do
        disp(subst(0,s,ratsimp(diff([p,q],t,i)/s^(i-1))));
[[0, 0, 0], 0]
[[0, 0, 0], 0]
[[0, 0, 0], 0]

(d29) done


It remains to justify the various orders of truncation in the intermediate Taylor
expansions. Since multiplication never decreases the order of validity of truncated Taylor
expansions, we must only consider (c15) and (c18). To indicate the range of significant
terms in an expansion

$$\varphi(s) = \varphi_j s^j + \varphi_{j+1} s^{j+1} + \cdots$$

we use the notation

$$\varphi \sim [j, J]$$

if the coefficients up to index $J$ agree with the exact expansion of $\varphi$. By (11), and since
the data are computed exact up to order 6 by (c10), the coefficients in the system (10)
satisfy

$$\mathrm{eqn}^i \sim \begin{pmatrix} [0,6] & [2,6] & [1,6] \\ [2,6] & [4,6] & 0 \end{pmatrix}.$$

This shows that

$$\rho \sim [2,6]/[4,6] \sim [-2,0]$$
$$\sigma \sim ([0,6] - [2,6] * [-2,0])/[1,6] \sim [-1,1]$$
$$\delta \sim [-1,1]/[-2,0] \sim [1,3]$$
$$w \sim [-2,0] * [1,3]^2 \sim [0,2]$$

and hence all subsequent expansions are exact at least up to order 2 which is what is
needed for the proof of (iv).
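The bookkeeping with ranges $[j, J]$ can be mechanized. The sketch below is an illustration under stated assumptions, not part of the original program: a range is a pair (lowest exponent, highest exact exponent), a product or quotient keeps as many exact coefficients as the less accurate factor, and a sum or difference is exact only up to the smaller upper index. The function names are hypothetical.

```python
def mul(a, b):
    """Range of a product: leading exponents add, exact terms limited by either factor."""
    (j1, J1), (j2, J2) = a, b
    return (j1 + j2, min(J1 + j2, J2 + j1))


def inv(a):
    """Range of a reciprocal: same number of exact coefficients, starting at -j."""
    j, J = a
    return (-j, J - 2 * j)


def div(a, b):
    """Range of a quotient a / b."""
    return mul(a, inv(b))


def sub(a, b):
    """Range of a sum or difference (the lower index is a bound if terms cancel)."""
    return (min(a[0], b[0]), min(a[1], b[1]))


rho = div((2, 6), (4, 6))
sigma = div(sub((0, 6), mul((2, 6), rho)), (1, 6))
delta = div(sigma, rho)
w = mul(rho, mul(delta, delta))
```

Under these rules the four ranges come out as (-2, 0), (-1, 1), (1, 3) and (0, 2), in agreement with the chain of relations above.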
Abstract. Given a strictly convex function of two variables f(x,y), a
$C^1$ box spline series approximant of this function, based on data {f(ih,jh)}
given at points uniformly spaced by h in the x and y directions, will also
be convex if h is sufficiently small. Explicit numerical bounds on h are
provided in this paper which guarantee convexity of these box spline series
approximants. The bounds are of the form $h \le c\varepsilon/L$ where $\varepsilon > 0$ is the minimum
value of $D_u^2 f$, L is a Lipschitz constant associated with the continuity of
the second derivatives of f, and the constant c depends on the particular
box spline approximant being used.

I.  Formulation 

Given data $\{f(ih, jh)\} = \{f_{ij}\}$, $\forall\, (i,j) \in \mathbb{Z}^2$, representing values of a
smooth function f(x,y) at the set of points {(ih,jh)} with uniform spacing
of h in the x and y directions, and a $C^1$ locally supported box spline
$\phi(x,y)$, there are various ways of determining the coefficients $\{c_{ij}\}$ of a
box spline series

$$s_h(x,y) = \sum_{i,j} c_{ij}\, \phi(x/h - i,\, y/h - j)$$

such that the order of approximation of f by $s_h$ is

$$|f(x,y) - s_h(x,y)| = O(h^3)$$

and the partial derivatives of f are simultaneously approximated with

$$|D^\alpha f(x,y) - D^\alpha s_h(x,y)| = O(h^{3-|\alpha|}), \quad |\alpha| = 1, 2.$$
In particular, if we denote by $D_u^2 f$ the second derivative of f in the
direction of the unit vector u, we have

(2)   $|D_u^2 f(x,y) - D_u^2 s_h(x,y)| = O(h)$.

If f(x,y) is a strictly convex function, say $D_u^2 f(x,y) \ge \varepsilon > 0$ for some
constant $\varepsilon$ independent of u and (x,y), then for sufficiently small h, the
approximation property (2) implies that the spline approximant $s_h$ is also
convex. The purpose of this paper is to provide explicit bounds on h which
guarantee convexity of $s_h$.

The function f is assumed to be strictly convex and to have Lipschitz
continuous second derivatives, which is sufficient for (2) to hold. More
specifically, we assume

(3)   a) $f \in C^2(\mathbb{R}^2)$

      b) $D_u^2 f(x,y) \ge \varepsilon > 0 \quad \forall\, u, x, y$

      c) $|D^\alpha f(x_1,y_1) - D^\alpha f(x_2,y_2)| \le L\,[(x_1-x_2)^2 + (y_1-y_2)^2]^{1/2}$, $|\alpha| = 2$.

The  remainder  of  the  paper  is  as  follows.  In  Section  II  we  introduce 
box splines and the class of approximants whose convexity-preserving
properties  will  be  studied.  In  Section  III  we  illustrate  the  estimation 
techniques  which  lead  to  the  bounds  on  h  which  preserve  convexity.  In 
Section  IV  these  bounds  are  presented.  We  do  not  present  the  details  of  the 
derivation  of  the  bounds  in  this  paper.  Rather,  the  reader  is  referred  to 
the  paper  [CDR2]  for  a  detailed  treatment  of  shape  preservation  in  bivariate 
spline  approximation. 

II. Box Splines and Spline Approximants

Let $V = \{v_1, v_2, \ldots, v_n\}$ be a set of (generally nondistinct) integer
vectors in $\mathbb{R}^2$ which also spans $\mathbb{R}^2$. The box spline $\phi_V(x,y)$ is defined to be
the probability density of the random linear combination $\sum_i t_i v_i$ where the
random n-tuple $(t_1, t_2, \ldots, t_n)$ in $\mathbb{R}^n$ is uniformly distributed in the "box"
$[-1/2, 1/2]^n$. Box splines are locally supported and piecewise polynomial.
Explicit formulas for the polynomial pieces can be recursively calculated
fairly easily; see [CL] for details of the construction and for several
examples important to the analysis described in Section III.

For reasons of simplicity, we restrict ourselves as to the possible
members of V. We define the vectors $e_1 = (1,0)$, $e_2 = (0,1)$, $e_3 = (1,1)$ and
$e_4 = (-1,1)$ and require that $v_i \in \{e_1, e_2, e_3, e_4\}$. With this restriction, we use
the notation $\phi_{n_1 n_2 n_3 n_4}(x,y)$ instead of $\phi_V(x,y)$, the four indices $n_1, n_2, n_3, n_4$
indicating the number of times the vectors $e_1, e_2, e_3, e_4$ appear in V,
respectively. We will further assume that $n_1 > 0$ and $n_2 > 0$. Finally, if the set
V is implicit or immaterial we simply write $\phi(x,y)$ for the box spline.

Roughly speaking, the more vectors in V, the smoother the box spline $\phi$.
Specifically,

(4)   $\phi \in C^p$ where $p = n - \max_i\{n_i\} - 2$, $n = \sum_i n_i$.

The spline space $\mathcal{S}$ is formed by taking linear combinations of the
integer translates of $\phi(x,y)$. An element $s(x,y) \in \mathcal{S}$ is given by

$$s(x,y) = \sum_{i,j} c_{ij}\, \phi(x-i,\, y-j).$$

The space contains all polynomials of degree p+1. An important property of
the spline space is that $\sum_{i,j} \phi(x-i,\, y-j) \equiv 1$.

Next we define the scaled spline space $\mathcal{S}_h$ with elements $s_h(x,y)$ given
by

(5)   $s_h(x,y) = \sum_{i,j} c_{ij}\, \phi_V(x/h - i,\, y/h - j)$.

The function $\phi_V(x/h, y/h)$ will be denoted by $B_h(x,y|V)$, and dropping the V for
convenience, we may write

(6)   $s_h(x,y) = \sum_{i,j} c_{ij}\, B_h(x-ih,\, y-jh)$

for elements of $\mathcal{S}_h$. The approximation power of the space $\mathcal{S}_h$ is $O(h^{p+2})$:
If the partial derivatives of order p+1 of a function f(x,y) are
Lipschitz continuous, there exist functions $s_h \in \mathcal{S}_h$ such that

$$|f(x,y) - s_h(x,y)| = O(h^{p+2}).$$

Details on the above results can be found in the survey paper [H].

We introduce next the class of approximants which realize the optimal
approximation order of $\mathcal{S}_h$ and whose convexity-preserving properties are the
subject of this paper. The results which follow are developed in detail in
[CD] and [CDR1].

We denote by F the vector of data values $\{f_{ij}\}$. Define the finite
difference operator M acting on data vectors as follows:

$$(MF)_{ij} = \sum_{r,s} m_{rs}\, f_{i-r,\,j-s}$$

where

$$m_{ij} = \begin{cases} 1 - \phi(0,0) & \text{if } i = j = 0 \\ -\phi(i,j) & \text{otherwise.} \end{cases}$$

The symmetry of $\phi(x,y)$ about the origin and the fact that $\sum_{i,j} \phi(i,j) = 1$ imply
that M can be expressed as a sum of central second difference operators.




The family of approximants $q_k(f)(x,y)$ is then defined as follows:

(7)
$$q_0(f)(x,y) = \sum_{i,j} f_{ij}\, B_h(x-ih,\, y-jh)$$
$$q_1(f)(x,y) = \sum_{i,j} (F + MF)_{ij}\, B_h(x-ih,\, y-jh)$$
$$q_k(f)(x,y) = \sum_{i,j} (F + MF + \cdots + M^k F)_{ij}\, B_h(x-ih,\, y-jh)$$
$$q_\infty(f)(x,y) = \sum_{i,j} (F + MF + \cdots)_{ij}\, B_h(x-ih,\, y-jh) = \lim_{k \to \infty} q_k(f)(x,y)$$

The limit required to calculate $q_\infty$ will exist only if $n_3$ or $n_4$ is zero. In
this case $q_\infty$ is the unique cardinal interpolant of the data.
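The coefficient sequences $F + MF + \cdots + M^k F$ are easy to generate once the values $\phi(i,j)$ are known. The sketch below is illustrative only: the mask PHI is a hypothetical symmetric set of values summing to 1 (it is not claimed to be the values of any particular box spline), data are represented as functions on $\mathbb{Z}^2$, and all names are assumptions.

```python
# Hypothetical values phi(i, j): symmetric about the origin and summing to 1.
PHI = {(0, 0): 0.5, (1, 0): 0.125, (-1, 0): 0.125, (0, 1): 0.125, (0, -1): 0.125}


def apply_M(F):
    """Return MF as a function on Z^2, using m = (unit mask) - phi."""
    def MF(i, j):
        return F(i, j) - sum(v * F(i - r, j - s) for (r, s), v in PHI.items())
    return MF


def coefficients(F, k):
    """Coefficients F + MF + ... + M^k F of the approximant q_k."""
    terms, G = [F], F
    for _ in range(k):
        G = apply_M(G)
        terms.append(G)
    return lambda i, j: sum(T(i, j) for T in terms)


def grid_value(c):
    """Value of the spline series at the grid point (ih, jh): sum of phi(r,s) c(i-r, j-s)."""
    return lambda i, j: sum(v * c(i - r, j - s) for (r, s), v in PHI.items())


linear = lambda i, j: 2 * i - 3 * j + 1
quadratic = lambda i, j: i * i
```

For linear data MF vanishes, so every $q_k$ has the data themselves as coefficients, which reflects the fact that M is a sum of central second differences. For the quadratic data and this particular mask, one correction step already reproduces the data at the grid points.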

The approximation order of the $q_k$ is as follows:

a) If 2k+2 < p+2 then $|f(x,y) - q_k(f)(x,y)| = O(h^{2k+2})$, provided that
$f \in C^{2k+1}$ and the partial derivatives of f of order 2k+1 are Lipschitz
continuous.

b) If 2k+2 ≥ p+2 then $|f(x,y) - q_k(f)(x,y)| = O(h^{p+2})$, provided that
$f \in C^{2p+1}$ and the partial derivatives of f of order 2p+1 are Lipschitz
continuous.

If 2k+2 ≥ p+2 then $q_k(f)(x,y)$ provides the optimal order of
approximation and is referred to as a quasi-interpolant of f, unless k = ∞, in
which case $q_\infty$ is referred to as the interpolant.

Our results in Section IV concern the convexity of the $q_k$.

We require one more result, due to Dahmen and Micchelli [DM]:

(8)   If $V_1$ and $V_2$ are two sets of integer vectors with $V_1 \subset V_2$, then
      if $\sum_{i,j} c_{ij}\, B_h(x-ih,\, y-jh\,|\,V_1)$ is convex, so is $\sum_{i,j} c_{ij}\, B_h(x-ih,\, y-jh\,|\,V_2)$.

III. Methods of Analysis

In the analysis of convexity of the $q_k$ we concern ourselves first with
the $C^1$ box splines $\phi_{1111}$, $\phi_{221}$, and $\phi_{122}$. (In the latter two the index $n_4$ is
zero and is omitted.) The convexity of these is relatively simple to
investigate as their second derivatives are piecewise constant in the first
case and piecewise linear in the other two. We obtain for each the Hessian
matrix $H_s(x,y)$ of an arbitrary $s_h(x,y)$ in terms of its coefficients $c_{ij}$.
This is effected with the aid of the formula [H], $v \cdot \nabla \phi_V = \delta_v \phi_{V \setminus \{v\}}$, where v is
an element of V, $\nabla$ is the gradient, and $\delta_v$ is the centered difference
operator given by $\delta_v \phi(\cdot) = \phi(\cdot + v/2) - \phi(\cdot - v/2)$. The Hessian matrices thus
obtained characteristically have as their entries a second difference of the
coefficients appropriate to the second derivative being calculated, with
perhaps an additional third order difference. For example, using $\phi_{221}$ we
present in Figure 2 the value of $H_s(x,y)$ at the point (x,y) = (ih+h/2, jh+h/2)
inside the triangular piece shown in Figure 1 (on this triangular piece,
$\phi_{221}$ is a cubic polynomial and the entries of $H_s$ are linear).


Figure 1 (triangular piece with the points (ih, jh) and (ih+h/2, jh+h/2) marked)


Figure 2 (matrix of difference-operator schematics for $H_s$(ih+h/2, jh+h/2))

The entries in the matrix of Figure 2 are schematics for difference
operators where the symbol "□" identifies the multiplier of $c_{ij}$ and
multipliers of adjacent coefficients are shown beside the symbols "o".

Next, the idea is to compare, for the approximants $q_k$, $D_u^2 s_h = u^T H_s u$ with
$D_u^2 f = u^T H_f u$, where u is a unit vector and
$H_f = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix}$ is the Hessian of
f(x,y). If the difference is less than $\varepsilon$ (the lower bound on $D_u^2 f$), we can
conclude that $s_h$ is convex.

We proceed to illustrate the comparison method. Again, we use as an
example the case $\phi_{221}$ and we compare $H_s(ih+h/2, jh+h/2)$ at the upper corner
of the triangle in Figure 1 with $H_f(ih, jh)$ in the case of
$s_h(x,y) = q_0(f)(x,y)$. For this choice of $s_h$ we have $c_{ij} = f_{ij}$. Consider now the
(1,1) entry of $H_s$. We have

(9)   $H_s^{(1,1)}(ih+h/2, jh+h/2) = h^{-2}(c_{i+1,j} - 2c_{ij} + c_{i-1,j}) = h^{-2}(f_{i+1,j} - 2f_{ij} + f_{i-1,j})$

and the latter second difference may be written in integral form:

(10)  $h^{-2}(f_{i+1,j} - 2f_{ij} + f_{i-1,j}) = \frac{1}{h^2} \int_{(i-1)h}^{(i+1)h} (h - |t - ih|)\, f_{xx}(t, jh)\, dt.$

Comparing this quantity with $H_f^{(1,1)}(ih, jh) = f_{xx}(ih, jh)$, we obtain

$$\left| H_s^{(1,1)}(ih+h/2, jh+h/2) - H_f^{(1,1)}(ih, jh) \right| = \left| \frac{1}{h^2} \int_{(i-1)h}^{(i+1)h} (h - |t - ih|)\, [f_{xx}(t, jh) - f_{xx}(ih, jh)]\, dt \right|$$
$$\le \frac{1}{h^2} \int_{(i-1)h}^{(i+1)h} (h - |t - ih|)\, L\, |t - ih|\, dt = hL/3.$$

The inequality was obtained by applying the Lipschitz condition in (3)
satisfied by $f_{xx}$.
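Both steps of this estimate can be checked numerically in one variable. The following sketch rests on assumptions made for the example: the test function sin, the sample values of x, h and L, and a plain midpoint rule (with an even number of panels so that the kink of the kernel at t = x falls on a panel boundary) are illustrative choices, and the function names are hypothetical.

```python
import math


def second_difference(f, x, h):
    """Central second difference (f(x+h) - 2 f(x) + f(x-h)) / h^2."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


def kernel_average(g, x, h, n=2000):
    """Midpoint rule for (1/h^2) * integral over [x-h, x+h] of (h - |t - x|) g(t) dt."""
    step = 2.0 * h / n
    total = 0.0
    for k in range(n):
        t = x - h + (k + 0.5) * step
        total += (h - abs(t - x)) * g(t)
    return total * step / h ** 2


x, h, L = 0.7, 0.1, 2.5
lhs = second_difference(math.sin, x, h)                    # second difference of f = sin
rhs = kernel_average(lambda t: -math.sin(t), x, h)         # kernel average of f'' = -sin
bound = kernel_average(lambda t: L * abs(t - x), x, h)     # should be close to h*L/3
```

The first two quantities agree to within the quadrature error, which illustrates the integral form of the second difference, and the last one reproduces the value hL/3 of the bound.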

A similar technique is then used to estimate the analogous differences
at the other entries of the Hessian matrices $H_s$ and $H_f$. We then obtain an
estimate of the form

(11)  $|D_u^2 f(ih, jh) - D_u^2 q_0(f)(ih+h/2, jh+h/2)| \le u^T E\, u$

where the entries of the 2 x 2 matrix $E$ are proportional to hL and we showed above
that its (1,1) entry is a = hL/3. If we denote by $\lambda$ the largest eigenvalue of the matrix, then
using the assumption $D_u^2 f(ih, jh) \ge \varepsilon$, we have

$$D_u^2 q_0(f)(ih+h/2, jh+h/2) \ge \varepsilon - \lambda \ge 0 \quad \text{if } \lambda \le \varepsilon.$$

Since $\lambda$ has the form hL/c we can obtain an inequality of the form $h \le c\varepsilon/L$
as a sufficient condition for convexity of $q_0$ in the triangle of Figure 1,
locally at the upper corner. Then similar estimates must be made for the
other two corners; these estimates are adequate as $D_u^2 q_0$ is linear in the
triangle so positivity at the three corners implies positivity in the
interior.

The investigation of sufficient conditions for convexity of $q_k$, k > 0,
requires estimates of the size of powers of the finite difference operator M
(which is itself a sum of second difference operators) applied to the
Hessian matrix in Figure 2 with $c_{ij} = f_{ij}$. These estimates, which involve
fourth order differences, are rather complicated so we will not go into the
details here, although manipulation of integral representations such as (10)
and application of the Lipschitz condition is again the technique. In any
case, inequalities of the form (11) can be obtained when $s_h = q_k$, where the
entries of the matrix are each proportional to hL, this in turn leading to a
sufficient condition of the form $h \le c\varepsilon/L$ for convexity of $q_k$.

The above technique was used to obtain sufficient conditions for
convexity of the $q_k$ for $\phi_{1111}$, $\phi_{221}$ and $\phi_{122}$ (see Table 1 in Section IV).
Consider now a smoother spline obtained by adding vectors to the vector set
V of one of these three fundamental splines. By the result (8) it is
sufficient to examine the Hessian matrix of the appropriate fundamental
spline. Thus for instance, in examining convexity in the spline space based
on $\phi_{222}$ (obtained by adding the vector $e_3$ to the vector set V of $\phi_{221}$), a
sufficient condition for convexity is that $u^T H_s u \ge 0$, where the Hessian $H_s$
used is that in Figure 2. The coefficients $c_{ij}$, however, are those
appropriate to the approximants based on $\phi_{222}$ through the finite difference
operator M.

IV.  Results 

We assume throughout this section that f(x,y) satisfies the hypotheses
(3). Table 1 below provides, for the fundamental splines $\phi_{1111}$, $\phi_{221}$ and
$\phi_{122}$, sufficient conditions on h guaranteeing convexity of $q_k(f)(x,y)$.

                              Spline
    k             φ_221         φ_122         φ_1111
    0             .909 ε/L      .480 ε/L      .727 ε/L
    1             .546 ε/L      .290 ε/L      .425 ε/L
    1 < k < ∞     .172 ε/L      .097 ε/L      (1.50+.976k)^-1 ε/L
    k = ∞         .172 ε/L      .097 ε/L      -

                              Table 1


A value of h smaller than the tabulated figure guarantees convexity. Since
k=1 provides the quasi-interpolant for the splines considered above, the
next approximant of interest is the interpolant, k = ∞, for which an
estimate was computed in the cases of $\phi_{221}$ and $\phi_{122}$; however this estimate
also applies to intermediate values of k. Of course for $\phi_{1111}$ the
approximant for k = ∞ does not apply.

The next tables give sufficient conditions for convexity for splines
obtained from the fundamental ones above by adding one or more of the
vectors $e_1$, $e_2$, $e_3$, $e_4$ to V.

Three parameters related to a box spline $\phi$ appear in the tables. These
are:

a) $\Gamma = \sum_{i,j} \phi(i,j)\,(i^2 + j^2)^{1/2}$

b) #supp($\phi$) = number of points (i,j) at which $\phi(i,j) \ne 0$

c) $a = 1 - \min_{\mu,\nu} \big\{ \sum_{m,n} \phi(m,n) \exp(i\mu m + i\nu n) \big\}$ where $i = \sqrt{-1}$, $(\mu,\nu) \in \mathbb{R}^2$

For box splines with $n_3 = 0$ or $n_4 = 0$, a < 1 always holds. For the other
cases, where $n_3 \ge 1$ and $n_4 \ge 1$, a ≥ 1 always holds. (We assume that $n_1 \ge 1$ and $n_2 \ge 1$
in all the cases under consideration.) For $\phi_{1111}$, as well as in several
other cases, a = 1. Although it is not proven that this is always true when all
four vectors appear in V we will only present estimates based on the
assumption a = 1.

Table 2 concerns box splines obtained from $\phi_{221}$ with $n_4 = 0$.

    k             n_1 ≥ 2, n_2 ≥ 2, n_3 ≥ 1, n_4 = 0
    0             (1.10)^-1 ε/L
    1             (1.10 + 2Γ)^-1 ε/L
    2             (1.10 + 6Γ)^-1 ε/L
    3 ≤ k ≤ ∞     [1.10 + 6Γ + 2aΓ{#supp(φ)}^(1/2) / (1-a)^2]^-1 ε/L

                              Table 2


In cases where $n_4 = 0$ but Table 2 does not apply (i.e. the box spline
cannot be derived from $\phi_{221}$), the bounds of Table 3, which apply to box
splines derived from $\phi_{122}$, can be used.

    k             n_1 = 1 or n_2 = 1, n_3 ≥ 2, n_4 = 0
    0             (2.52)^-1 ε/L
    1             (2.52 + 4Γ)^-1 ε/L
    2             (2.52 + 12Γ)^-1 ε/L
    3 ≤ k ≤ ∞     [2.52 + 12Γ + 4a(2-a)Γ{#supp(φ)}^(1/2) / (1-a)^2]^-1 ε/L

                              Table 3


For box splines with four directions, i.e. box splines for which $n_3 \ge 1$
and $n_4 \ge 1$, as stated above, we assume that a = 1. If the box spline in question
can be derived from $\phi_{221}$ we obtain the more favorable bounds on h summarized
in Table 4. The first three entries in Table 4 are the same as in Table 2
since these are derived without using the value of a but using only the fact
that $\sum_{i,j} \phi(i,j) = 1$.
ABSTRACT. Given a large set of scattered data $(x_i, y_i, f_i)$, a
method  for  selecting  a  significantly  smaller  set  of  knot  points 
which  will  represent  the  larger  set  is  described.  The  algorithm 
for  selection  of  the  knot  point  locations  is  based  on  the 
minimization  of  the  sum  of  the  squares  of  the  difference  between 
the  average  number  of  points  per  Dirichlet  tile  and  the  actual 
number  of  points  in  each  tile,  subject  to  the  constraint  that 
each  knot  is  located  at  the  centroid  of  its  tile.  Using  the 
least  squares  Thin  Plate  Spline  approximation  method  for 
constructing  surfaces,  various  test  surfaces  are  examined  and 
compared  to  surfaces  obtained  using  the  smoothing  spline  and  the 
bicubic  Hermite  approximation  methods. 

I. INTRODUCTION. The problem of fitting a surface to small
sets  of  given  data  has  been  addressed  in  many  different  ways  and 
several  programs  are  currently  available  which  enable  one  to  deal 
with  the  problem  effectively.  The  methods  available  involve 
either  interpolation  or  approximation;  solving  the  interpolation 
problem  involves  a  system  of  equations  with  an  equivalent  number 
of  unknowns.  For  very  large  sets  of  data,  the  problem  is 
computationally  intractable.  This  consideration  provides  the 
motivation  behind  the  development  of  a  way  to  pare  the  problem 
down  to  a  more  manageable  size. 

We  wish  to  construct  a  function  F  which  approximately  fits 
the  data  since  we  assume  the  data  collection  is  subject  to 
measurement  error.  We  propose  to  use  approximation  by  least 
squares  Thin  Plate  Splines  (TPS),  where  the  surface  function  is 
constructed  so  as  to  minimize  an  error  function  subject  to 
certain  constraints.  Solving  the  approximation  problem  will  also 
involve  as  many  equations  as  there  are  data  points,  but  the 
number  of  unknowns  will  be  significantly  fewer.  Part  of  the 
appeal  of  TPS  approximation  lies  in  the  fact  that  it  minimizes  a 
certain  linear  functional,  and  involves  a  linear  combination  of 
functions  with  no  greater  complexity  than  the  natural  logarithm 
of  the  distance  function. 


*  The  work  of  the  second  author 
Office  of  Naval  Research  under  P 
NO.  BR033-02-WH. 
Interpolation of scattered data by the method of TPS was
developed from engineering considerations by Harder and Desmarais
[1]. It can be thought of as a two dimensional
generalization  of  a  cubic  spline,  which  models  a  thin  beam  under 
point  loads  subject  to  equilibrium  constraints.  The  TPS  function 
is  derived  from  a  differential  equation  which  gives  the 
deformation  of  an  infinite,  thin  plate  under  the  influence  of 
point  loads.  A  point  load  is  applied  at  each  data  point  so  that 
the  interpolating  surface  can  be  constructed  as  a  sum  of 
fundamental  solutions  of  the  TPS  function. 

A relationship exists between the data and the basis functions
which span a certain higher dimensional function space, as
seen in the one dimensional analogue found in cubic spline
interpolation. The term knot refers to the places at which two
adjacent cubic polynomials are joined. The particular set of
basis functions in a cubic spline interpolation depends on the
knot points, as well as the data points. However, in
approximation, the data points and the knot points need not
coincide. Furthermore, the particular basis
functions found in approximation may depend on the knot
points, and on the data points as well. In using the least squares
TPS approximation method to fit the surface, fewer basis
functions than given data points are employed.
These basis functions are centered at a different, smaller set of
points: the knots. Therefore, the problem at hand is one of
selecting the knot points, and hence the basis functions.

This  approach  differs  from  the  use  of  smoothing  splines, 
which  were  introduced  by  Wahba  and  Wendelberger  [3]  in  the 
multidimensional  case,  and  called  Laplacian  Smoothing  Splines 
(LSS).  LSS  minimize  a  certain  functional  which  is  a  linear 
combination  of  a  term  measuring  fidelity  to  the  data  and  one 
measuring  smoothness  of  the  function  (a  generalization  of  the 
usual  thin  plate  spline  functional).  In  this  case,  there  is 
still  one  basis  function  for  each  data  point,  but  the 
interpolation  condition  is  relaxed. 

Given a 'large' set of data points, $(x_i, y_i, f_i)$, $i = 1,\dots,N$,
we wish to find a smaller set of knot points, $(x_j, y_j)$, $j = 1,\dots,K$,
which will 'represent' the former reasonably well. This
could be accomplished by choosing a subset of the original set,
or by some process which produces a representative set. The
ultimate goal is to approximate the surface from which the original
data arose using the representative set. Hence, a surface
fit to the large set and one fit to the representative set would
essentially be the same.

Approximation  by  least  squares  TPS  is  straightforward.  We 
construct  the  TPS  function 


$$F(x,y) = \sum_{j=1}^{K} A_j d_j^2 \log(d_j) + ax + by + c$$

where $d_j^2 = (x - x_j)^2 + (y - y_j)^2$ and the coefficients are
chosen to minimize the error function

$$E = \sum_{i=1}^{N} \left\{ [F(x_i, y_i) - f_i] / s_i \right\}^2 .$$

The ordinates, $f_i$, may be subject to random errors, say
with standard deviation, $s_i$, at the ith data point. We model the
plate  under  the  point  loads  at  the  knot  points  (as  opposed  to  the 
data  points);  therefore  the  constraint  equations  for  the  TPS 
method,  which  may  be  thought  of  as  'equilibrium  conditions'  on  the 
plate,  should  be  satisfied.  Thus,  the  error  function  is 
minimized  subject  to  the  constraint  equations: 

$$\sum_{j=1}^{K} A_j = 0, \qquad \sum_{j=1}^{K} A_j x_j = 0, \qquad \sum_{j=1}^{K} A_j y_j = 0 .$$
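
The definitions above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names (`tps_value`, `weighted_error`, `constraint_residuals`) and the argument layout are hypothetical and do not come from the authors' program. The sketch evaluates F, the error function E, and the left-hand sides of the three constraint equations for given coefficients.

```python
import math

def tps_value(x, y, knots, A, a, b, c):
    """Evaluate F(x, y) = sum_j A_j d_j^2 log(d_j) + a x + b y + c."""
    total = a * x + b * y + c
    for (xj, yj), Aj in zip(knots, A):
        d2 = (x - xj) ** 2 + (y - yj) ** 2
        if d2 > 0.0:  # d^2 log(d) tends to 0 as d tends to 0, so skip d = 0
            total += Aj * d2 * 0.5 * math.log(d2)  # log(d) = 0.5 * log(d^2)
    return total

def weighted_error(data, sigmas, knots, A, a, b, c):
    """E = sum_i ((F(x_i, y_i) - f_i) / s_i)^2 over the data points."""
    return sum(((tps_value(x, y, knots, A, a, b, c) - f) / s) ** 2
               for (x, y, f), s in zip(data, sigmas))

def constraint_residuals(knots, A):
    """Left-hand sides of the three constraint equations (each should be 0)."""
    return (sum(A),
            sum(Aj * xj for (xj, _), Aj in zip(knots, A)),
            sum(Aj * yj for (_, yj), Aj in zip(knots, A)))
```

For instance, with knots at the corners of the unit square and coefficients A = (1, -1, -1, 1), all three constraint residuals vanish, and F(0,0) reduces to c + log 2.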

Attempts  have  been  made  to  minimize  the  error  function  by 
considering  it  to  be  a  function  of  the  knot  point  locations  as 
well  as  the  coefficients,  wherein  a  total  of  3K  parameters  are 
involved. As reported on by Schmidt [2], the initial knot configuration
was taken to be of tensor product form. The overall
minimization process is a large non-linear one, and is complicated
by possible coalescence of knots as well as non-unique
solutions  (as  indicated  by  consideration  of  one-dimensional 
cases).  Also,  the  objective  function  may  have  many  local  minima 
so  that  avoiding  poor  local  minima  or  searching  for  better  local 
minima  may  be  necessary.  Because  of  these  kinds  of  problems,  our 
goal  is  to  decouple  the  knot  selection  process  from  the  least 
squares  process. 

When data are somewhat uniformly distributed, methods involving
tensor product cubic splines may be desirable. Tensor product
methods place knot locations on a grid, which does not
necessarily reflect the actual disposition of the data points;
in fact, there could be no data nearby. Even though these
problems are surmountable, they could lead to non-uniqueness of
solutions and a minimum norm solution that is not aesthetically
appealing.

A  different  point  of  view  is  considered  here  where  the  knot 
point  locations  are  predetermined  based  on  two  criteria. 
Specifically, we shall make assumptions relating the density of
data to the dependent variable and mandating the importance of
each individual data point. Solution of the overdetermined
system of equations follows the knot point selection. A summary
of  the  approach  and  its  results  will  be  presented.  Examples  are 
given  which  illustrate  rather  well  the  ability  of  the  scheme  to 
select  knot  locations  which  reflect  the  underlying  density  of  the 
data.  Actual  surface  fitting  and  comparison  with  two  other 
methods,  the  Laplacian  smoothing  splines  of  Wahba  and 
Wendelberger  [3],  and  the  tensor  product  bicubic  Hermite  method 
due  to  Foley  [4],  are  also  reported  on. 


II. THE KNOT SELECTION PROCESS. Given a priori flexibility in
knot placement, the problem becomes the selection of knot
locations, followed by solution of the system by least squares.
Since the selection of knot locations is to be decoupled from the
solution of the least squares problem, some assumptions must be
made in order to develop an algorithm for the knot selection
process.

First,  we  assume  that  the  independent  variable  data 
reflects  something  about  the  behavior  of  the  dependent  variable. 
For  example,  the  density  of  the  data  points  may  be  dependent  on 
the  curvature  of  the  surface.  Hence,  where  relatively  many  data 
points  are  found,  the  function  is  assumed  to  be  changing  behavior 
rapidly,  whereas  a  low  density  of  data  indicates  slowly  changing 
behavior. Although this assumption is not universally satisfied
in practice, it does not seem an unreasonable one.

The  second  assumption  is  that  each  data  point  is  equally 
important  in  defining  the  underlying  surface.  Therefore  the 
number  of  data  points  represented  by  each  knot  should  be  the  same 
or  nearly  the  same.  This  leads  to  'equal  representation'  of  the 
data  points  by  the  knot  points  where  each  data  point  is  'close' 
to  a  knot  point.  A  key  advantage  is  achieved  in  pursuing  this 
approach  in  the  form  of  a  natural  heuristic  for  moving  the  knots 
around  the  plane  in  searching  for  the  optimal  knot  configuration. 
This  point  will  be  elaborated  on  later  in  the  paper. 

Our knot selection algorithm is based on these last two
assumptions. First, we wish to minimize the sum of the distances
squared from each data point to the nearest knot point; that is,
minimize the 'global' value

$$\mathrm{GN2} = \sum_{i=1}^{N} \min_{j} \left[ (x_i - x_j)^2 + (y_i - y_j)^2 \right] ,$$

where $i$ indexes the data points and $j$ the knot points.


This is global in the sense that it accounts for the contributions
from all of the K tiles. The expression leads naturally
to a 'default' Dirichlet Tessellation, a partitioning of the plane
with respect to the knot points (Figure 1.1). Thus, each data
point belongs to some knot point according to the Dirichlet tile
in which it lies. Data points on any of the tile boundaries
(ties) must be resolved by a determination of which tile they
belong to or by some sharing mechanism.
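
As a minimal sketch of this assignment, the Python fragment below finds the Dirichlet tile of each data point and the resulting GN2 value. The names `dist2`, `nearest_knot` and `gn2` are illustrative assumptions, and ties are resolved in favor of the smallest knot index, which is one of several possible rules.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points in the plane."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def nearest_knot(p, knots):
    """Index of the knot whose Dirichlet tile contains p.

    min() returns the first minimizer, so a data point on a tile
    boundary is assigned to the knot with the smallest index.
    """
    return min(range(len(knots)), key=lambda j: dist2(p, knots[j]))

def gn2(points, knots):
    """Sum over data points of the squared distance to the nearest knot."""
    return sum(dist2(p, knots[nearest_knot(p, knots)]) for p in points)
```

With data points (0,0), (1,0), (4,0), (5,0) and knots (0.5,0), (4.5,0), the first two points fall in the first tile, the last two in the second, and GN2 is four times 0.25.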


Differentiation of GN2 with respect to $x_j$ and $y_j$ shows that
at the minimum, each knot point will occupy the centroid of its
tile with respect to the data points inside that tile. Given
some initial configuration of knot points with its default
Dirichlet Tessellation (our initial guess for the
configuration was taken to be quasi-gridded), the following
algorithm for iteration to a local minimum GN2 value is employed:
Figure 1.1  A Dirichlet Tessellation with 5 Tiles. It is
constructed  by  connecting  the  perpendicular  bisectors  to  the  lines 
joining  each  of  the  knot  points. 

(a)  compute  the  centroid  of  each  tile  with  respect  to  the 
data  points  contained  within  each  tile; 

(b)  move the knots to the corresponding centroids, which
results in a new Dirichlet Tessellation and a new set of knot
point - data point associations; this is the configuration for
the next iteration;

(c)  quit when two successive iterations yield the same knot
locations, which means that a local minimum of the global value
GN2 has been found.

This algorithm was formulated in discussions at the Istituto per
le Applicazioni della Matematica e dell'Informatica in 1983 [5],
after the problem was posed by G. Nielson and R. Franke.

We  note  that  the  value  of  GN2  will  necessarily  decrease  as 
the  iterations  continue,  until  two  successive  iterations  yield 
the same configuration; this will be proven below. In the case
where no data points lie in a tile for some knot point, the knot
point is moved to the nearest data point. This mechanism avoids
knots without data points. Furthermore, if a data point lies on a
tile boundary, it is assigned to the knot with the smallest
subscript (amongst the appropriate choices of knot points).
Employment of a different criterion for the resolution of ties
will yield different results. We note that knots cannot coalesce.
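
The iteration (a) through (c), together with the empty-tile and tie-breaking rules just described, can be sketched in Python as follows. This is an illustrative reading of the text and not the authors' program; the function name `stabilize` and the `max_iter` safeguard are assumptions.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points in the plane."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def stabilize(points, knots, max_iter=1000):
    """Move knots to tile centroids until the configuration repeats.

    Ties on a tile boundary go to the smallest knot index; a knot whose
    tile is empty is moved to the data point nearest to it.
    Returns the final knots and the data points grouped by tile.
    """
    knots = [tuple(k) for k in knots]
    tiles = []
    for _ in range(max_iter):
        # assign every data point to its nearest knot (its Dirichlet tile)
        tiles = [[] for _ in knots]
        for p in points:
            j = min(range(len(knots)), key=lambda m: dist2(p, knots[m]))
            tiles[j].append(p)
        # step (a)/(b): compute centroids and move the knots there
        new_knots = []
        for k, tile in zip(knots, tiles):
            if tile:
                n = len(tile)
                new_knots.append((sum(p[0] for p in tile) / n,
                                  sum(p[1] for p in tile) / n))
            else:
                new_knots.append(tuple(min(points, key=lambda p: dist2(p, k))))
        # step (c): stop when two successive iterations agree
        if new_knots == knots:
            break
        knots = new_knots
    return knots, tiles
```

Starting two knots at (0,0) and (1,0) for two well separated clusters of three points each, the sketch ends with one knot at the centroid of each cluster.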

The  following  theorem  is  pertinent  to  this  algorithm. 

THEOREM: The function GN2 decreases with each iteration which
involves  movement  of  a  knot  point. 

PROOF [5]: Write GN2 in the more convenient form

$$\mathrm{GN2} = \sum_{j=1}^{K} \sum_{i \in I_j} \left[ (x_i - x_j)^2 + (y_i - y_j)^2 \right] \qquad (1)$$

where $I_j = \{ i : (x_i, y_i) \text{ belongs to } (x_j, y_j) \}$. In (1), the
interior sum is the sum of the squared distances from the data points in
a tile to the knot point in that tile, and the exterior sum is
over all K of the tiles. Let a prime denote the new knot points
and index sets. This form leads to the expressions

$$(x'_j, y'_j) = \left( \sum_{i \in I_j} x_i / p_j ,\ \sum_{i \in I_j} y_i / p_j \right),$$

where $p_j$ is the number of indices in each set $I_j$. The set
$I_j$ contains the indices for the data points in the tile for the jth
knot point. The new knot points will lead to a new tessellation,
followed by the new index sets $I'_j$. Then the expression (1) is
greater than or equal to

$$\sum_{j=1}^{K} \sum_{i \in I_j} \left[ (x_i - x'_j)^2 + (y_i - y'_j)^2 \right] \qquad (2)$$

because the new knot point locations minimize the contribution of
the interior sums. This expression (2), in turn, is greater than
or equal to

$$\sum_{j=1}^{K} \sum_{i \in I'_j} \left[ (x_i - x'_j)^2 + (y_i - y'_j)^2 \right] \qquad (3)$$

since  an  index  i  moves  to  another  set  only  in  the  case  wherein 
the  corresponding  data  point  is  now  closer  to  a  different  knot 
point,  thus  decreasing  its  contribution  to  the  global  GN2  value. 
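
The step from (1) to (2) rests on the fact that the centroid minimizes each interior sum. A short worked identity makes this explicit; here $(u, v)$ is an arbitrary candidate location for the jth knot, a symbol introduced only for this illustration.

```latex
\begin{align*}
\sum_{i \in I_j} \left[ (x_i - u)^2 + (y_i - v)^2 \right]
  &= \sum_{i \in I_j} \left[ (x_i - x'_j)^2 + (y_i - y'_j)^2 \right]
   + p_j \left[ (x'_j - u)^2 + (y'_j - v)^2 \right] \\
  &\quad + 2 (x'_j - u) \sum_{i \in I_j} (x_i - x'_j)
   + 2 (y'_j - v) \sum_{i \in I_j} (y_i - y'_j) .
\end{align*}
```

Since $x'_j$ and $y'_j$ are the means of the $x_i$ and $y_i$ over $I_j$, the two cross terms vanish, and the remaining term $p_j [ (x'_j - u)^2 + (y'_j - v)^2 ]$ is nonnegative and zero only at the centroid. Taking $(u, v)$ to be the old knot location gives the inequality between (1) and (2) tile by tile.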

Finding a local minimum of GN2 is well-served by this algorithm;
however, as seen next in a one dimensional example, the
function GN2 is rife with local minima, and the local minimum
value found depends on the initial configuration of knots used.

We  can  draw  similar  conclusions  for  the  multi-dimensional  case 
based  on  the  one  dimensional  analogy. 
[Figures 1.2 through 1.5: one dimensional cross-sections of GN2. The
panel above the Figure 1.5 caption is titled KNOT FIXED AT 2.0, with a
horizontal axis running from -5.0 to 3.0.]

Figure 1.5  One Dimensional Cross-section of GN2. Observe the
phenomenon called cascading where the variable knot point seeks
that location which minimizes the GN2 function value, subject to
it being within the domain of the quadratic.


Tables  I  and  II  summarize  two  possible  knot  movement 
scenarios given the same initial guess for the knot point locations.
The scenarios differ in that the first one employs the
specific  criterion  for  breaking  ties  wherein  the  data  point  is 
assigned  to  the  knot  with  the  smallest  subscript.  The  other 
scenario  employs  an  alternative  tie-breaking  scheme.  For  a  fixed 
set  of  data  points,  the  initial  guess,  which  can  be  generated  by 
the  program  or  provided  by  the  user,  leads  directly  to  the 
assignment  of  the  data  to  the  closest  knot  point.  This  is 
followed  by  the  determination  of  the  new  knot  location  via  the 
centroid  criterion.  The  process  continues  until  stabilization 
occurs.  Used  in  conjunction  with  the  Figures  1.2  through  1.5, 
these  tables  yield  valuable  insight  into  how  the  one  dimensional 
case  works,  and  lend  themselves  to  understanding  the 
multidimensional  case. 


TABLE I

                     KNOT POINT   DATA POINT      KNOT POINT   DATA POINT    SEE
ITERATION            x1           ASSIGNMENT      x2           ASSIGNMENT    FIG.

0 (Initial Guess)    -0.75        {-1,-0.5}       0.5          {0,0.33,2}    1.2
1 (New Knot)         -0.75        {-1,-0.5,0}     0.78         {0.33,2}      1.3
2 (New Knot)         -0.5         {-1,-0.5,0}     1.167        {0.33,2}      1.4
3 (New Knot)         -0.5         {-1,-0.5,0}     1.167        {0.33,2}      1.4

STABILIZATION


Table  I.  Trial  one  employs  the  tie  breaking  criterion  described 
earlier  where  the  data  point  is  assigned  to  the  knot  point  with 
the  smallest  subscript.  Both  tables  are  read  in  zig-zag  fashion 
following  the  flow  of  each  iteration  separately,  but  in  tandem 
with  the  other  knot  point  assignments. 

TABLE II

                     KNOT POINT   DATA POINT          KNOT POINT   DATA POINT    SEE
ITERATION            x1           ASSIGNMENT          x2           ASSIGNMENT    FIG.

0 (Initial Guess)    -0.75        {-1,-0.5}           0.5          {0,0.33,2}    1.2
1 (New Knot)         -0.75        {-1,-0.5,0}         0.78         {0.33,2}      1.3
2 (New Knot)         -0.5         {-1,-0.5,0}         1.167        {0.33,2}      1.4
3 (New Knot)         -0.5         {-1,-0.5,0,0.33}    1.167        {2}           1.4
4 (New Knot)         -0.292       {-1,-0.5,0,0.33}    2.0          {2}           1.5
5 (New Knot)         -0.292       {-1,-0.5,0,0.33}    2.0          {2}           1.5

STABILIZATION


Table  II.  In  trial  two,  the  tie  is  resolved  differently,  so  that 
an  alternative  data  point  assignment  is  made  at  iteration  3.  Hence, 
the  final  outcome  is  significantly  different.  Note  how  each 
iteration  is  referenced  to  one  of  the  Figures  1.2  through  1.5. 
The algorithm for finding a local minimum value of GN2
performs inconsistently, as seen in the cascading phenomenon
wherein the GN2 function may have several local minimum values.
We are led to consideration of a somewhat different criterion
for locating the best configuration of knot points. We wish to
exploit the second assumption specified earlier, while still
taking advantage of the minimization of the GN2 function.

Since each data point is assumed to be equally important,
the Dirichlet tile for each knot should contain about the same
number of data points. Thus, we wish to minimize the sum of the
squares of the differences between the number of data points in each
tile and the average number that should belong to each tile;
that is, minimize the quantity

$$D = \sum_{j=1}^{K} (N_j - N/K)^2 ,$$

where $N_j$ is the number of data points in the jth tile. The new
algorithm for determining knot locations is based on the minimization
of D, subject to the constraint that each knot be located
at the centroid of its tile.
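
As a small illustration of the quantity D, the Python sketch below counts the data points per tile and evaluates the sum of squared deviations from the average N/K. The names `tile_counts` and `balance_d` are hypothetical, and ties are again given to the smallest knot index.

```python
def tile_counts(points, knots):
    """Number of data points N_j in the Dirichlet tile of each knot."""
    counts = [0] * len(knots)
    for p in points:
        j = min(range(len(knots)),
                key=lambda m: (p[0] - knots[m][0]) ** 2 + (p[1] - knots[m][1]) ** 2)
        counts[j] += 1
    return counts

def balance_d(counts):
    """D = sum_j (N_j - N/K)^2 for the given per-tile counts."""
    n, k = sum(counts), len(counts)
    return sum((c - n / k) ** 2 for c in counts)
```

For eight data points and two knots, tile counts of 6 and 2 give D = 8, while the perfectly balanced counts 4 and 4 give D = 0.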


This  new  optimization  leads  to  a  natural  heuristic  for 
moving  knots  from  a  stable  configuration  to  a  possibly  better 
configuration.  We  call  the  current  configuration  of  knots  a  base 
configuration, and iterate through the algorithm as follows:

(a)  generate a new guess for the knot locations by moving the
knot(s) with the smallest number of data points in their tiles
toward the knot(s) with the largest number of data points in their
tiles; the distance moved is initially a large fraction of the
total distance between the knots;

(b)  iterate to a stable configuration using the first
algorithm, compute the values of GN2 and D, and compare D to the
smallest value achieved to date, as represented by that of the
base configuration;

(c)  repeat the process above when a smaller value of GN2 is obtained,
with the present configuration as the base configuration;

(d)  when a smaller value of GN2 is not found, take a shorter
step in the movement of the knot(s) and repeat the process above;

(e)  continue with smaller and smaller steps until a smaller
value of D is found (or an equal value of D with a smaller GN2
value) or until the knot locations return to the base configuration;

(f)  perform the search in the symmetrical way when the base
configuration is returned to; that is, move the knot(s) with the
largest number of data points in their tile(s) toward the knot(s)
with the smallest number of data points in their tile(s);

(g)  quit  when  no  smaller  value  of  D  is  found. 


The  movement  of  the  knots  is  justified  by  the  rationale  that 
a  more  equitable  distribution  of  data  points  can  be  found  by 
moving  the  tile  boundaries  across  data  points.  Note  that  the 
first  algorithm  for  computing  the  GN2  function  value  is  embedded 
in  this  new  algorithm. 
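
A simplified sketch of this search is given below in Python. It is not the authors' program and makes several assumptions that should be read as such: only one knot is moved per trial (the first knot with the fewest points toward the first knot with the most), the step fraction starts at a hypothetical 0.9 and is halved after each failure, a trial is accepted when D is smaller or when D is equal and GN2 is smaller, and the symmetrical search of step (f) is omitted.

```python
def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def tiles_of(points, knots):
    """Group data points by nearest knot; ties go to the smallest index."""
    tiles = [[] for _ in knots]
    for p in points:
        j = min(range(len(knots)), key=lambda m: dist2(p, knots[m]))
        tiles[j].append(p)
    return tiles

def stabilize(points, knots, max_iter=500):
    """First algorithm: move knots to tile centroids until nothing changes."""
    knots = [tuple(k) for k in knots]
    for _ in range(max_iter):
        new = []
        for k, t in zip(knots, tiles_of(points, knots)):
            if t:
                new.append((sum(p[0] for p in t) / len(t),
                            sum(p[1] for p in t) / len(t)))
            else:  # empty tile: move the knot to the nearest data point
                new.append(tuple(min(points, key=lambda p: dist2(p, k))))
        if new == knots:
            break
        knots = new
    return knots

def score(points, knots):
    """(D, GN2); tuples compare D first and GN2 second."""
    tiles = tiles_of(points, knots)
    n, k = len(points), len(knots)
    d = sum((k * len(t) - n) ** 2 for t in tiles) / k ** 2  # sum (N_j - N/K)^2
    g = sum(dist2(p, kn) for kn, t in zip(knots, tiles) for p in t)
    return d, g

def improve(points, base, first_step=0.9, min_step=1e-3):
    """Search for a configuration with smaller D from a base configuration."""
    base = stabilize(points, base)
    best = score(points, base)
    step = first_step
    while step >= min_step:
        counts = [len(t) for t in tiles_of(points, base)]
        lo, hi = counts.index(min(counts)), counts.index(max(counts))
        if lo == hi:  # all tiles hold the same number of points
            break
        trial = list(base)
        trial[lo] = (base[lo][0] + step * (base[hi][0] - base[lo][0]),
                     base[lo][1] + step * (base[hi][1] - base[lo][1]))
        trial = stabilize(points, trial)
        s = score(points, trial)
        if s < best:   # new base configuration, restart with a large step
            base, best, step = trial, s, first_step
        else:          # no improvement: take a shorter step
            step /= 2
    return base, best
```

On the collinear data 0, 1, 2, 3, 4, 5, 8, 9 with starting knots at 2.5 and 8.5, the stable starting configuration has tile counts 6 and 2 (D = 8); the sketch moves to a configuration with counts 5 and 3 (D = 2), at the price of a slightly larger GN2.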
Once the knot point locations have been determined, the
least squares problem is solved using the public domain software,
LINPACK [6]. We call the (N+3) x (K+3) matrix of equations A, the
(K+3) x 1 column matrix of unknown coefficients x, and the
(N+3) x 1 column matrix of dependent variables b, so that the least
squares problem can be posed as solution of the system $Ax = b$. The
LINPACK subroutines employ a QR decomposition of matrix A, so that
the system can be written as $QRx = b$. Multiplication by $Q^T$ yields
$Rx = Q^T b$ since $Q^T Q = I$. R is a rectangular matrix with dimensions
(N+3) x (K+3), which is zero below its main diagonal; writing $R_1$
for its upper triangular (K+3) x (K+3) block, multiplication by the
block matrix yields the result $x = R_1^{-1} Q^T b$, where only the first
K+3 components of $Q^T b$ are used. Thus, the computation of x requires
only the matrix-vector multiplication $Q^T b$, followed by back
substitution in the triangular system $R_1 x = Q^T b$. Because a
Householder algorithm is used for the QR decomposition, the
computation is numerically stable.
Finally, with the known coefficients in hand, a grid of surface
values can be computed, which can be subsequently used by a
user-provided plotting routine to generate a plot of the surface.
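
The QR route to the least squares solution can be sketched in pure Python as follows. This is an illustration of the technique only: it does not call or reproduce the LINPACK routines, the function name `householder_lstsq` is hypothetical, and the matrix is assumed to have at least as many rows as columns and full column rank.

```python
import math

def householder_lstsq(A, b):
    """Least squares solution of A x = b by Householder QR.

    A is a list of m rows of length n (m >= n, full column rank).
    R and Q^T b are built in place on copies; x comes from back
    substitution in the upper triangular n x n block of R.
    """
    m, n = len(A), len(A[0])
    R = [[float(v) for v in row] for row in A]
    qtb = [float(v) for v in b]
    for k in range(n):
        norm = math.sqrt(sum(R[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            raise ValueError("matrix is rank deficient")
        # choose the sign that avoids cancellation in v[k]
        alpha = -norm if R[k][k] >= 0 else norm
        v = [0.0] * m
        for i in range(k, m):
            v[i] = R[i][k]
        v[k] -= alpha
        vtv = sum(v[i] ** 2 for i in range(k, m))
        # apply H = I - 2 v v^T / (v^T v) to the remaining columns of R ...
        for j in range(k, n):
            s = 2.0 * sum(v[i] * R[i][j] for i in range(k, m)) / vtv
            for i in range(k, m):
                R[i][j] -= s * v[i]
        # ... and to the right-hand side, accumulating Q^T b
        s = 2.0 * sum(v[i] * qtb[i] for i in range(k, m)) / vtv
        for i in range(k, m):
            qtb[i] -= s * v[i]
    # back substitution in R_1 x = (first n components of Q^T b)
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (qtb[k] - sum(R[k][j] * x[j] for j in range(k + 1, n))) / R[k][k]
    return x
```

As a usage example, fitting a straight line through the exact data (0,1), (1,3), (2,5), (3,7) with rows (x, 1) recovers the slope 2 and intercept 1, and the inconsistent system with the single column (1, 1) and right-hand side (1, 3) returns the mean, 2.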


III. RESULTS AND EXAMPLES. Using the least squares algorithm
for the a priori selection of the knot point locations,
experiments were conducted to test the scheme using different
sets of test data. This was followed by verification of the
scheme on a large set of real data. Results from two sets of
the test data are presented here: one consisting of 200 data
points called 'Cliff', and one consisting of 500 data points
called 'Humps and Dips'. Both sets of data were generated using
known functions (see Franke [7]) in a way that forced the
disposition  of  points  to  be  proportional  to  the  curvature  of  the 
sampled  function.  Figures  1.6  through  1.11  portray  the  test  data 
sets  graphically,  and  illustrate  the  optimized  knot  point 
configurations  found  using  the  least  squares  algorithm.  Figure 
1.11  is  particularly  encouraging,  since  it  depicts  actual 
hydrographic  data  collected  in  Monterey  Bay.  We  note  that  the 
assumption  regarding  the  density  of  the  data  being  indicative  of 
the  behavior  of  the  dependent  variable  is  not  actually  satisfied 
in  this  case  due  to  the  source  of  the  data.  Nonetheless,  these 
results  demonstrate  the  ability  of  the  algorithm  to  produce 
representative  sets  of  knots. 

We  also  investigated  how  closely  the  constructed  surface  F 
and  the  'true'  surface  resemble  one  another.  This  comparison  is 
made  in  the  context  of  the  root-mean-squared  error  (RMS)  of  both 
the  residuals  (at  the  data  points)  and  on  a  rectangular  grid  of 
locations  in  the  plane.  Tables  III  and  IV  provide  a  comparison 
of  the  RMS  error  for  the  test  data  sets  using  the  least  squares 
algorithm  developed  here,  the  method  of  Wahba  and  Wendelberger 
[3],  and  the  method  of  Foley  [4].  The  dependent  variables  of 
the  experimental  data  sets  were  generated  in  two  ways:  1)  using 
a  known  function,  and  2)  contaminating  it  by  the  injection  of 
independent,  normally  distributed  random  errors  with  a  composite 
standard  deviation  of  less  than  0.05.  The  actual  standard 
deviation  was  about  0.0485. 
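
The two error measures and the contamination step can be sketched in a few lines of Python. The function names, the fixed seed, and the use of the standard library generator are assumptions made for this illustration; the paper does not state how its random errors were generated.

```python
import math
import random

def rms(values):
    """Root-mean-squared value of a list of errors or residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def contaminate(f_values, sigma, seed=0):
    """Add independent, normally distributed errors with standard deviation sigma."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [f + rng.gauss(0.0, sigma) for f in f_values]
```

Applied to the residuals at the data points and to the differences on a rectangular grid, `rms` gives the two kinds of entries reported in the tables; for a long list of contaminated values, the RMS of the injected errors should be close to sigma.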
Figures 1.6 and 1.7. The 'Cliff' data set. Note the relatively
dense disposition of data points across the diagonal where the
underlying surface drops off. The 25 knot points used clearly
reflect the behavior of the data set, as expected.
Figures 1.8 and 1.9. The 'Humps and Dips' data set. Note how
clumps  of  data  appear  in  three  portions  of  the  plane,  indicating 
that  the  underlying  surface  is  undergoing  change.  A  set  of  50 
knot  points  was  used  to  represent  the  data. 
TABLE III

COMPARISON OF RMS ERRORS ON 'CLIFF' 200 POINTS

              NUMBER OF        NO ERRORS IN DATA      CONTAMINATED DATA
              DATA POINTS/
METHOD        KNOT POINTS      RESIDUAL    GRID       RESIDUAL    GRID

LSTPS         200/20           .01562      .01474     .05214      .01795
LSTPS         200/25           .01179      .01154     .04805      .02040
FOLEY         200/5X5          .00777      .00613     .05996      .04819
LSTPS         200/35           .00626      .00616     .04590      .02146
FOLEY         200/6X6          .00512      .00417     .05113      .03745
SMOOTHING     200              0.0         .00096     .04272      .01806

TABLE IV

COMPARISON OF RMS ERRORS ON 'HUMPS & DIPS' 500 POINTS

              NUMBER OF        NO ERRORS IN DATA      CONTAMINATED DATA
              DATA POINTS/
METHOD        KNOT POINTS      RESIDUAL    GRID       RESIDUAL    GRID

LSTPS         500/20           .02402      .02517     .05256      .02738
LSTPS         500/25           .01664      .01766     .04818      .02283
FOLEY         500/5X5          .01346      .01230     .05844      .03767
LSTPS         500/50           .00645      .00845     .04544      .01961
FOLEY         500/7X7          .00645      .00552     .05696      .04864

Tables  III  and  IV.  Comparison  of  RMS  errors  in  two  sets  of  data, 
each  with  an  exact  and  a  contaminated  version,  using  three 
methods  for  surface  construction. 
In the first case, we would expect to see an overall
decrease  in  the  RMS  error  on  both  the  residuals  and  the  grid  as 
the  number  of  knot  points  used  to  represent  the  data  is 
increased.  In  the  contaminated  case,  the  dependent  variable  at 
each  data  point  is  the  sum  of  the  unknown  underlying  function 
value  and  the  error  function  value  so  that  the  difference  between 
the  constructed  surface  and  the  'true'  surface  is  entirely 
attributable  to  the  presence  of  error  in  the  data.  Thus,  we 
expect  the  RMS  error  in  the  residuals  to  match  the  composite 
standard  deviation  of  the  random  error  injected  into  the 
contaminated  data.  At  the  grid  points,  we  expect  the  RMS  error 
to  be  smaller  than  the  composite  standard  deviation,  since  the 
grid  sample  is  larger  (33x33)  and  the  errors  are  distributed  more 
evenly  throughout  the  entire  region  of  interest.  In  the  case 
where  no  error  is  present,  we  expect  that  the  difference  between 
the  constructed  surface  and  the  'true'  surface  is  entirely  due  to 
'slack'  in  the  constructed  surface.  We  anticipate  that  the  RMS 
error  in  the  residuals  would  be  approximately  equal  to  the  RMS 
error  on  the  grid,  thereby  giving  evidence  that  the  error  in  the 
constructed  surface  is  uniformly  distributed  over  the  entire 
region  of  interest. 

Some  observations  can  be  made  regarding  Tables  III  and  IV. 
The  general  trend  of  the  RMS  error  on  both  the  residuals  and  the 
grid  is  to  decrease  as  the  number  of  knot  points  is  increased. 
As  expected  with  the  exact  data,  the  RMS  error  of  the  residuals 
and  the  RMS  error  on  the  grid  are  roughly  equivalent.  For  the 
contaminated  data,  the  RMS  error  of  the  residuals  roughly  matches 
the  composite  standard  deviation  of  the  data,  and  the  RMS  error 
on  the  grid  is  smaller  than  the  RMS  error  of  the  residuals,  as 
expected. The discrepancy between the RMS error on the grid and
the RMS error in the residuals cannot be totally attributed to
the injected error; it is the result of 'undersmoothing', where
the constructed surface tends to fit the error rather than the
data.

In  comparing  the  least  squares  to  the  smoothing  spline 
method  in  the  exact  data  case,  we  note  that  the  smoothing  spline 
method  yields  a  residual  RMS  error  of  0.  This  could  be  expected, 
since  there  is  no  error  in  the  data  and  the  spline  of 
interpolation is chosen. On the grid, the RMS error is small but
nonzero, since some amount of error on the grid is expected. When the
data are not contaminated, the RMS errors of the least squares
algorithm begin to approach those of the smoothing splines method
only as the number of knots used becomes large. We also note
that for the 500 data point set (Humps and Dips), no comparison is
made, since a potential limit for computing smoothing splines is
200-300 data points.

In  comparing  Foley's  method  to  the  least  squares  method  for 
the  contaminated  case,  the  RMS  error  on  the  residuals  is  nearly 
equal  to  the  composite  standard  deviation  injected  into  the  data. 
However,  on  the  grid,  the  least  squares  method  does  better,  an 
indication  that  smoothing  is  occurring,  as  expected.  We  also 
note  that  an  increase  in  the  number  of  grid  points  does  not 
significantly improve the RMS error in Foley's method, even
though  an  increase  in  the  number  of  knots  in  the  least  squares 
method  usually  yields  improved  results.  We  used  the  default 
local  approximations  in  Foley's  method,  and  we  note  that 
performance  of  the  method  may  be  improved  through  the  use  of 
lower  degree  local  approximations  to  estimate  the  grid  values  to 
be  used. 


Finally, we note that the search for a best knot configuration
can turn out to be rather expensive. For a large number of
data points with a moderately large number of knot points, the
computational effort could be excessive, although we are
investigating ways of speeding up the algorithm. Furthermore, as
we  noted  earlier,  the  end  results  are  dependent  on  the  initial 
guess,  although  they  generally  look  quite  good  for  any  reasonable 
initial  guess. 

A Rapid, Backscatter Simulation Technique
for  Complex  B-Spline  Target  Models 


Karl  D.  Reinig 

Advanced  Electronics  Systems  Laboratory 
Sensors  and  Signal  Processing  Technology  Division 
Harry  Diamond  Laboratories,  U.S.  Army  LABCOM 
2800  Powder  Mill  Rd,  Adelphi,  MD  20783 

ABSTRACT. This paper describes a method for rapidly evaluating the
simulated radar backscatter signatures of B-spline target models moving
relative to a source/receiver. A geometric optics approach is used to
estimate the radar return from a complex target surface described by a
bi-cubic B-spline mesh. The method exploits the second-order continuity of
bi-cubic B-spline surfaces to reduce the problem of finding all the specular
points associated with each new trajectory position to that of tracking the
motion of existing points. In particular, it is shown that the locations of
the annihilations and creations of specular paths may be predicted for an
entire trajectory, eliminating the need to search the whole surface for
specular points as the target moves relative to the source/receiver. The
method is shown to work for the multiple-bounce case as well.

1  Introduction 

Consider the scenario depicted in figure 1. A source/receiver (S/R) moving
along some trajectory illuminates a target of interest. It is desired to
estimate the return from the target as the S/R moves along the trajectory.
Notice that whether figure 1 describes a target detection/identification
problem or the terminal phase of a guided munition is mostly determined by
the trajectory being considered. A backscatter simulation technique which
places few or no restrictions on the paths of the trajectories to be
simulated would therefore find use in all phases of seeker munitions
studies. In addition, of course, the relative motion between the S/R and
the target could be due strictly to the motion of the target. Thus the
scenario also includes the return from passing targets.

Figure 1: Target Encounter Scenario

This  paper  discusses  the  use  of  a  geometric  optics  approach  to  simulate 
the  radar  return  from  complex  but  generally  smooth  targets.  The  overall 
simulation  method  can  be  broken  into  two  basic  parts.  First,  the  surface  of 
a  target  is  described  using  bi-variate  piecewise  polynomial  functions  such  as 
bi-cubic  tensor  product  B-spline  surfaces.  Second,  for  each  location  along 
any  desired  trajectory,  the  positions  on  the  surface  (the  specular  points) 
from  which  a  ray  leaving  the  source  would  be  reflected  back  to  the  receiver 
are  found  along  with  their  local  principal  radii  of  curvature.  The  locations 
of  the  specular  points  are  used  along  with  their  local  radii  of  curvature 
to  calculate  the  discrete  radar  cross  sections.  The  overall  return  from  the 
complex  target  is  then  found  by  coherently  summing  the  expected  discrete 
return  from  each  specular  point. 
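
The last two steps can be sketched in a few lines of Python. The formulas are not spelled out in the text, so the sketch assumes the standard geometric optics result that a doubly curved specular point with principal radii rho1 and rho2 has cross section pi*rho1*rho2, and that the discrete returns add as complex amplitudes with a two-way phase of 2*k*r. The function names, the sign convention of the phase, and the sample values are illustrative assumptions.

```python
import numpy as np

def specular_rcs(rho1, rho2):
    """Geometric optics cross section of one specular point (assumed formula)."""
    return np.pi * np.abs(np.asarray(rho1) * np.asarray(rho2))

def coherent_return(ranges, rho1, rho2, wavelength):
    """Complex sum of the discrete returns; |result|**2 is the total cross section."""
    k = 2.0 * np.pi / wavelength
    amplitudes = np.sqrt(specular_rcs(rho1, rho2))
    # Two-way path phase; the sign is a convention chosen for this sketch.
    phases = np.exp(-2j * k * np.asarray(ranges))
    return np.sum(amplitudes * phases)

# A single specular point on a sphere of radius 0.5 returns pi * 0.5**2.
single = abs(coherent_return([10.0], [0.5], [0.5], wavelength=0.03)) ** 2

# Two equal points a quarter wavelength apart in range cancel (two-way phase pi).
pair = abs(coherent_return([10.0, 10.0075], [0.5, 0.5], [0.5, 0.5],
                           wavelength=0.03)) ** 2
```

The second case illustrates why the sum has to be coherent: the two points individually have the same cross section as the first case, yet their combined return vanishes.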

Previous studies have demonstrated the usefulness of the geometric optics
approach for computing the expected return from a composite of simple
analytic shapes [1]. However, as the targets of interest become more
complex or the desire to match their surfaces more accurately increases, the
use  of  simple  analytical  shapes  to  describe  the  target  surface  often  becomes 
impractical.  Three-dimensional  faceted  models  exist  for  most  targets  of 
interest, including tanks, helicopters, and jet aircraft. These faceted
models generally use several thousand facets to describe the target surface
and contain a great deal of detail. Unfortunately, the faceted models do not
directly give useful geometric optics information. For example, both
principal radii of curvature of a faceted model are unbounded everywhere
except at facet edges, where they are undefined. This paper focuses on a
technique which exploits the second-order continuity of bi-cubic B-spline
surfaces to reduce the problem of finding all the specular points associated
with each new trajectory position to that of tracking the motion of existing
points. For a complex target, this reduction saves multiple orders of
magnitude in computation. In addition, it is shown that the technique
can be easily extended to the multiple-bounce case, with the potential for
even greater savings.


2  Single-Bounce Return Problem


The geometry of the specular return problem from a single patch of an
arbitrary B-spline surface is shown in figure 2.

Figure 2: Specular Return Geometry

R_t(A) is the current position of the projectile along a linear trajectory.
R_s(u,v) describes the target surface as a function of the two parameters
u and v. And G(u,v,A) is the difference between the two vectors R_t(A) and
R_s(u,v). It will be
assumed  throughout  this  paper  that  the  projectile  has  an  unobstructed 
view  of  the  surface  being  considered;  i.e.,  the  problem  of  shadowing  will 
not  be  addressed  here.  A  necessary  and  sufficient  condition  for  a  point  on 
the surface to be a specular point relative to the position R_t(A) is that
the l_2 norm of G(u,v,A) be either a local maximum or minimum with respect
to  the  two  target  surface  parameters  u  and  v.  Finding  all  the  specular 
points  of  a  given  surface  (for  a  given  trajectory  position)  is  therefore  the 
same  as  finding  all  u  and  v  which  satisfy  the  two  nonlinear  equations 


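$$
F_1(u,v,A) = \frac{\partial}{\partial u}\,\|G(u,v,A)\|^2 = 0, \tag{1}
$$

$$
F_2(u,v,A) = \frac{\partial}{\partial v}\,\|G(u,v,A)\|^2 = 0. \tag{2}
$$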


If G(u,v) is given by a tensor product of cubic B-splines on a uniform
grid, both F_1(u,v) and F_2(u,v) may be written explicitly in terms of u
and v. In general though, solving for the u and v (call them u* and v*)
which satisfy equations (1) and (2) requires a numerical technique. Simple
application  of  Newton’s  method  for  nonlinear  equations  will  find  solutions 
to  (1)  and  (2)  provided  the  search  is  begun  “close  enough”  to  (u*,v*).  The 
question  of  how  close  is  close  enough  is  a  complex  one,  which  ultimately 
depends  on  the  variation  of  the  surface  being  considered. 
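
A minimal sketch of such a Newton search is given below. It is not the author's implementation: the smooth analytic surface stands in for a B-spline patch, the derivatives of the squared distance are taken by central differences rather than from the spline basis, and all names and numerical values are assumptions made for illustration.

```python
import numpy as np

def surface(u, v):
    # Hypothetical smooth surface R_s(u, v) standing in for one B-spline patch.
    return np.array([u, v, 0.5 * np.sin(u) * np.cos(v)])

def grad_hess(u, v, r_t, h=1e-4):
    """Central-difference (F1, F2) and their 2x2 Jacobian for ||G||^2."""
    f = lambda p, q: float(np.sum((r_t - surface(p, q)) ** 2))
    f0 = f(u, v)
    grad = np.array([(f(u + h, v) - f(u - h, v)) / (2 * h),
                     (f(u, v + h) - f(u, v - h)) / (2 * h)])
    f_uu = (f(u + h, v) - 2 * f0 + f(u - h, v)) / h**2
    f_vv = (f(u, v + h) - 2 * f0 + f(u, v - h)) / h**2
    f_uv = (f(u + h, v + h) - f(u + h, v - h)
            - f(u - h, v + h) + f(u - h, v - h)) / (4 * h**2)
    return grad, np.array([[f_uu, f_uv], [f_uv, f_vv]])

def newton_specular(u, v, r_t, tol=1e-8, max_iter=50):
    """Newton iteration on F1 = F2 = 0 from the starting guess (u, v)."""
    for _ in range(max_iter):
        grad, jac = grad_hess(u, v, r_t)
        if np.linalg.norm(grad) < tol:
            return u, v
        u, v = np.array([u, v]) + np.linalg.solve(jac, -grad)
    raise RuntimeError("Newton search did not converge from this start")

# The S/R sits directly above the crest at (u, v) = (pi/2, 0), where the
# surface normal points straight up, so that crest is a specular point.
r_t = np.array([np.pi / 2, 0.0, 5.0])
u_s, v_s = newton_specular(1.3, 0.2, r_t)
```

Started from (1.3, 0.2), the iteration settles on the crest; a start far from any specular point may fail to converge, which is the "close enough" issue raised above.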


3  Twinkles 


Suppose  the  coordinates  of  a  specular  point  are  known  for  a  particular 
value  of  A  and  we  wish  to  observe  the  motion  of  the  specular  point  as  A 
changes.  The  following  argument  is  a  trivial  extension  of  that  given  by 
Longuet-Higgins  for  the  case  of  a  time-varying  analytic  surface  [2].  Taking 


the  differential  of  equations  (1)  and  (2)  with  respect  to  u,  v,  and  A  yields 


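$$
\begin{pmatrix}
\dfrac{\partial^2 \|G\|^2}{\partial u^2} & \dfrac{\partial^2 \|G\|^2}{\partial u\,\partial v} \\[2ex]
\dfrac{\partial^2 \|G\|^2}{\partial u\,\partial v} & \dfrac{\partial^2 \|G\|^2}{\partial v^2}
\end{pmatrix}
\begin{pmatrix} \dfrac{du}{dA} \\[2ex] \dfrac{dv}{dA} \end{pmatrix}
= -
\begin{pmatrix}
\dfrac{\partial^2 \|G\|^2}{\partial u\,\partial A} \\[2ex]
\dfrac{\partial^2 \|G\|^2}{\partial v\,\partial A}
\end{pmatrix}.
\tag{3}
$$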

Denote the two-by-two matrix of equation (3) as J(u,v,A). Assuming the
elements of J(u,v,A) are continuous functions of u, v, and A, if the matrix
is nonsingular, du/dA and dv/dA will both be finite, which implies that
the changes in u and v can be kept as small as desired by choosing the
change in A small enough. Longuet-Higgins [2] refers to the vanishing of
the determinant of J(u,v,A) as a twinkle. The physical significance of this
result is that as the S/R moves across a second-order continuous surface,
specular points cannot suddenly appear or disappear unless J(u,v,A) is
locally singular. The observation that specular points move in continuous
paths broken only when J(u,v,A) is singular leads directly to the following
conclusion. If, for any given trajectory, it were possible to determine all
the points (u,v,A) for which J(u,v,A) is singular, it would no longer
be necessary to search the entire surface for specular points at different
positions along the trajectory. It would only be necessary to find all the
specular points corresponding to one trajectory position (Rn for example)
and then track their motion, picking up or losing specular paths only at
twinkles.
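
The tracking idea can be sketched as follows. This is an illustrative sketch, not the author's code: the surface and the linear trajectory are hypothetical, derivatives are taken by central differences, and each new trajectory position is solved by a Newton iteration started from the previous solution. The determinant of J is recorded at every step, since a sign change or a value near zero would signal a nearby twinkle.

```python
import numpy as np

def surface(u, v):
    # Hypothetical smooth surface R_s(u, v) standing in for a B-spline patch.
    return np.array([u, v, 0.5 * np.sin(u) * np.cos(v)])

def trajectory(a):
    # Hypothetical linear trajectory R_t(A) passing above the surface.
    return np.array([np.pi / 2 + a, 0.0, 5.0])

def grad_hess(u, v, a, h=1e-4):
    """Central-difference gradient (F1, F2) and matrix J of ||G||^2 in (u, v)."""
    f = lambda p, q: float(np.sum((trajectory(a) - surface(p, q)) ** 2))
    f0 = f(u, v)
    grad = np.array([(f(u + h, v) - f(u - h, v)) / (2 * h),
                     (f(u, v + h) - f(u, v - h)) / (2 * h)])
    f_uu = (f(u + h, v) - 2 * f0 + f(u - h, v)) / h**2
    f_vv = (f(u, v + h) - 2 * f0 + f(u, v - h)) / h**2
    f_uv = (f(u + h, v + h) - f(u + h, v - h)
            - f(u - h, v + h) + f(u - h, v - h)) / (4 * h**2)
    return grad, np.array([[f_uu, f_uv], [f_uv, f_vv]])

def track(u, v, a_values, tol=1e-8, max_iter=50):
    """Follow one specular point along the trajectory, reusing the last solution."""
    path = []
    for a in a_values:
        for _ in range(max_iter):
            grad, jac = grad_hess(u, v, a)
            if np.linalg.norm(grad) < tol:
                break
            u, v = np.array([u, v]) + np.linalg.solve(jac, -grad)
        else:
            raise RuntimeError("lost the specular point; a twinkle may be near")
        path.append((a, u, v, np.linalg.det(jac)))
    return np.array(path)

# Columns of `path`: A, u, v, det J.  The start (pi/2, 0) is specular at A = 0.
path = track(np.pi / 2, 0.0, np.linspace(0.0, 0.5, 26))
```

On this short trajectory det J stays positive, so the single path is continuous and each step needs only a few Newton iterations instead of a search of the whole surface.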


4  Finding  Twinkles 


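Let

$$
F_1 = \frac{\partial \|G\|^2}{\partial u}, \qquad
F_2 = \frac{\partial \|G\|^2}{\partial v}, \qquad \text{and} \qquad
F_3 = \det J(u,v,A).
$$

Then the Newton step toward the parametric coordinates of a twinkle must
satisfy

$$
\begin{pmatrix}
\dfrac{\partial F_1}{\partial u} & \dfrac{\partial F_1}{\partial v} & \dfrac{\partial F_1}{\partial A} \\[2ex]
\dfrac{\partial F_2}{\partial u} & \dfrac{\partial F_2}{\partial v} & \dfrac{\partial F_2}{\partial A} \\[2ex]
\dfrac{\partial F_3}{\partial u} & \dfrac{\partial F_3}{\partial v} & \dfrac{\partial F_3}{\partial A}
\end{pmatrix}
\begin{pmatrix} \delta u \\ \delta v \\ \delta A \end{pmatrix}
=
\begin{pmatrix} -F_1 \\ -F_2 \\ -F_3 \end{pmatrix}.
$$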


Thus,  local  search  techniques  exist  for  finding  all  potential  locations  on 
the  target  surface,  as  a  function  of  the  trajectory  position,  for  which  the 
specular  paths  are  discontinuous. 
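
A minimal sketch of this three-variable Newton search is shown below. To keep the example checkable by hand, the squared distance is replaced by a hypothetical model function f = u**3/3 + A*u + v**2, whose only twinkle is at u = v = A = 0; for a real target, F1, F2, and F3 would be built from the B-spline surface instead. The finite-difference Jacobian and all names are assumptions of the sketch.

```python
import numpy as np

def residuals(x):
    """F1, F2, F3 for the hypothetical model f = u**3/3 + A*u + v**2."""
    u, v, a = x
    f_u = u**2 + a                      # F1 = df/du
    f_v = 2.0 * v                       # F2 = df/dv
    det_j = (2.0 * u) * 2.0 - 0.0**2    # F3 = f_uu * f_vv - f_uv**2
    return np.array([f_u, f_v, det_j])

def jacobian(func, x, h=1e-6):
    """Central-difference Jacobian of func with respect to (u, v, A)."""
    n = len(x)
    jac = np.empty((n, n))
    for k in range(n):
        step = np.zeros(n)
        step[k] = h
        jac[:, k] = (func(x + step) - func(x - step)) / (2 * h)
    return jac

def newton_twinkle(x0, tol=1e-10, max_iter=50):
    """Newton iteration on F1 = F2 = F3 = 0 starting from x0 = (u, v, A)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residuals(x)
        if np.linalg.norm(r) < tol:
            return x
        x = x + np.linalg.solve(jacobian(residuals, x), -r)
    raise RuntimeError("twinkle search did not converge")

twinkle = newton_twinkle([0.4, 0.3, -0.2])
```

As with any local search, the result depends on the starting point, and a patch holding several twinkles would need several starts.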


5  Specular  Paths  Near  a  Twinkle 


By themselves, the conditions for a twinkle do not tell whether a particular
twinkle represents a birth or death of a pair of specular points with respect
to a chosen trajectory direction (e.g., time). Once again, a straightforward
extension of the analysis by Longuet-Higgins [2] gives a method for
describing the motion of specular paths near a twinkle, including whether
the twinkle represents the birth or death of a pair of paths. Define

$$
a_{ijk} = \left. \frac{\partial^{\,i+j+k}\,\|G(u,v,A)\|^2}
{\partial u^i\,\partial v^j\,\partial A^k} \right|_{u=v=A=0}
$$

where the coordinate system is chosen such that u = v = A = 0 at the
twinkle and

$$
a_{000} = a_{100} = a_{010} = a_{110} = a_{200} = 0. \tag{5}
$$


It can be easily seen from Longuet-Higgins' analysis that near a twinkle the
u, v coordinates of a specular point are given by

Equations (6) show that if a_101/a_300 is positive, then two solutions exist
when A is less than zero, and no solutions exist when A is greater than
zero; i.e., an annihilation of a pair of specular paths occurs. Similarly, the
creation of a pair of specular paths occurs when a_101/a_300 is negative. It
remains to determine a transformation of coordinates for which equations
(6) hold. Simply choosing the origin of the new coordinate system to be
the location of the twinkle ensures that a_000 is equal to zero. It is useful
to consider the surface formed by letting the function ||G(u,v,A)||^2 be the
w coordinate in an orthogonal u,v,w coordinate frame as shown in figure 3.
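
The sign rule can be motivated by a leading-order expansion. The following is a sketch under stated assumptions rather than a restatement of the printed result: it takes the coefficients to be plain partial derivatives of the squared distance at the twinkle, assumes the conditions (5) hold and that a_020 is nonzero, and keeps only the lowest-order terms of the Taylor series of the two specular conditions.

$$
\frac{\partial \|G\|^2}{\partial u} \approx a_{101}\,A + \tfrac{1}{2}\,a_{300}\,u^2 = 0
\quad\Longrightarrow\quad
u \approx \pm\sqrt{-\,\frac{2\,a_{101}}{a_{300}}\,A}\,,
$$

$$
\frac{\partial \|G\|^2}{\partial v} \approx a_{020}\,v + a_{011}\,A + \tfrac{1}{2}\,a_{210}\,u^2 = 0
\quad\Longrightarrow\quad
v \approx -\,\frac{1}{a_{020}}\left(a_{011} - \frac{a_{210}\,a_{101}}{a_{300}}\right) A .
$$

Real values of u exist only when the quantity under the square root is nonnegative. With a positive ratio of a_101 to a_300 this requires A to be negative, which gives two specular points before the twinkle and none after it, in agreement with the annihilation case described above.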


Figure  3:  Distance  Surface 

For  lack  of  a  better  term,  this  surface  will  be  referred  to  as  the  “distance 
surface.”  The  condition  for  a  twinkle  may  then  be  interpreted  physically 
as the vanishing of the Gaussian curvature of the distance surface. Or
alternately stated, at a twinkle, one of the two principal curvatures
of the distance surface is equal to zero. The curvature of any smooth
surface, at a given point, in the direction δu, δv may be written as (see,
among others, Faux and Pratt [3])

$$
\kappa_n = (\delta u)^2\,a_{200} + (\delta u)(\delta v)\,a_{110} + (\delta v)^2\,a_{020}. \tag{7}
$$

Now suppose a rotation of coordinates is made such that the u coordinate
is aligned with the principal direction having zero curvature (that such
a direction exists is ensured at a twinkle). Then with δv = 0, equation
(7) yields

$$
0 = (\delta u)^2\,a_{200},
$$

which implies that a_200 = 0. It can be seen that the twinkle condition
implies

$$
a_{200}\,a_{020} - a_{110}^2 = 0.
$$

Therefore, in the rotated coordinate system, a_110 must also be equal to zero
and equations (5) are satisfied. Define the new coordinates u', v', and A'
by

$$
\begin{aligned}
u &= u'\cos\theta - v'\sin\theta + u_{tw}, \\
v &= u'\sin\theta + v'\cos\theta + v_{tw}, \\
A &= A' + A_{tw},
\end{aligned}
$$

where u_tw, v_tw, and A_tw are the original coordinates of the twinkle. Then
if θ is the angle which rotates the original u axis into the direction of zero
curvature, the signs of a_101 and a_300, in the new coordinates, will tell if
the twinkle represents a birth or a death. In addition, the motion of the
specular points in the vicinity of the twinkle (in the new coordinates) will
be given by equations (6).
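
The rotation angle and the birth/death test can be sketched numerically. The sketch assumes the coefficients are plain second partial derivatives, so that the matrix [[a_200, a_110], [a_110, a_020]] is the Hessian of the distance surface at the twinkle, and it finds the zero-curvature direction as the eigenvector whose eigenvalue is closest to zero. The function names and sample coefficients are hypothetical.

```python
import numpy as np

def zero_curvature_angle(a200, a110, a020):
    """Angle theta of the direction in which the second derivative vanishes."""
    hessian = np.array([[a200, a110], [a110, a020]])
    eigvals, eigvecs = np.linalg.eigh(hessian)
    n = eigvecs[:, np.argmin(np.abs(eigvals))]   # eigenvector of the ~zero eigenvalue
    return np.arctan2(n[1], n[0])

def classify_twinkle(a101_rot, a300_rot):
    """Sign rule from the text, using coefficients in the rotated coordinates."""
    ratio = a101_rot / a300_rot
    return "annihilation" if ratio > 0 else "creation"

# Sample coefficients satisfying the twinkle condition a200 * a020 - a110**2 = 0.
a200, a110, a020 = 1.0, 2.0, 4.0
theta = zero_curvature_angle(a200, a110, a020)

# Second derivative along the rotated u axis; it should vanish at a twinkle.
c, s = np.cos(theta), np.sin(theta)
a200_rot = a200 * c * c + 2.0 * a110 * c * s + a020 * s * s
```

The sign of the returned eigenvector is arbitrary, so theta is determined only up to a half turn; flipping the rotated u axis changes the sign of both a_101 and a_300 and leaves their ratio, and hence the classification, unchanged.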


6  Single-Bounce  Example 


Figures 4 and 5 show an example of the use of twinkles for tracking
single-bounce specular paths on a B-spline surface. Figure 4 shows the
B-spline surface control mesh and desired trajectory, as well as the
locations along the trajectory where specular path discontinuities are
expected (based on a search for twinkles). Each twinkle location along the
trajectory is labeled B or D depending on whether the twinkle represents the
birth or death of a pair of specular paths, and lines have been drawn
connecting them with their associated locations on the target surface. In
addition, the predicted paths of the specular points in the area near each
twinkle are shown. Figure 5 shows the results after tracking the specular
paths for 1000 locations along the trajectory. A comparison of figures 4 and
5 shows that the specular paths did in fact remain continuous everywhere
except at the twinkles and moved as predicted in the regions near each
twinkle. Because completely searching the entire target for specular points
at each trajectory location was unnecessary, the entire simulation took only
a couple of minutes. Even if the global search for specular points could be
reduced to 10 seconds per trajectory location, the simulation would have
taken over 2-1/2 hours without the use of twinkles.
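
The estimate follows from simple arithmetic on the two figures quoted above, 1000 trajectory locations and a supposed 10 seconds per global search:

$$
1000 \times 10\ \mathrm{s} = 10^{4}\ \mathrm{s} \approx 2.8\ \mathrm{hours},
$$

which is consistent with the stated figure of over 2-1/2 hours.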

7  Nth-Order  Specular  Points 

Often  multiple-bounce  return,  as  depicted  in  figure  6,  produces  a  significant 
contribution  to  the  overall  target  backscatter.  A  weak  form  of  Fermat’s 
principle  of  optics  guarantees  that  any  multiple-bounce  return  path  will 
be  stationary  with  respect  to  the  2n  surface  parameters  of  the  bounce.  In 
terms  of  the  distances  between  bounces,  this  becomes 

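$$
\frac{\partial}{\partial u_j} \sum_i d_i = 0 = \frac{\partial}{\partial v_j} \sum_i d_i
$$

for all j = 1, ..., n. Noting that

$$
\frac{\partial d_i}{\partial u_j} = \frac{\partial d_i}{\partial v_j} = 0
$$

whenever j < i - 1 or j > i, we get the 2n conditions

$$
\frac{\partial (d_i + d_{i+1})}{\partial u_i} = 0 = \frac{\partial (d_i + d_{i+1})}{\partial v_i}.
$$

Denote d_i + d_{i+1} by G_i. Then, taking the differential of the previous 2n
equations with respect to the 2n surface parameters (u_i, v_i, i = 1, ..., n)
and the trajectory parameter A yields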

Figure 6: Multiple-Bounce Return


The same argument which led to the conclusion that single-bounce specular
paths can only be created or annihilated at twinkles may be directly
extended to include multiple-bounce twinkles; i.e., the vanishing of the
determinant of the 2n x 2n Jacobian of equation (8) is required for a
discontinuous motion of the nth-order specular paths.

8  Double-Bounce  Example 

Figures 7 and 8 show an example of the use of double-bounce twinkles
for tracking double-bounce specular points on a simple B-spline surface.
Figure 7 shows a control mesh for a simple crescent-shaped ribbon which
has been tilted slightly so it may be viewed. A short trajectory is shown
along with the only double-bounce twinkle associated with that trajectory
and surface. In addition, a triangle has been drawn to show the
double-bounce path associated with the twinkle. Figure 8 shows the results
of searching the surface for double-bounce specular points at 40 locations
along the trajectory. As expected, the number of double-bounce specular
points associated with each trajectory location before the twinkle did not
change (there were none). At the twinkle, two sets of specular paths were
created (pairs of specular bounces associated with the same double-bounce
path are shown connected by a line). The two sets of double-bounce paths
moved in generally opposing directions from their origin.

9  Conclusion and Near-Term Efforts


The  use  of  twinkles  to  reduce  the  problem  of  finding  specular  points  for 
many  locations  along  a  trajectory  to  that  of  tracking  their  motion  can 
greatly  reduce  the  computation  time  required  to  find  the  geometric  optics 
return  from  a  complex  surface.  Two  fundamental  problems  are  left  to  be 
solved  before  the  method  can  be  used  widely.  The  first  is  the  current 
lack  of  B-spline  models  which  describe  targets  of  interest.  While  spline 
modeling  of  complex  targets  is  not  expected  to  be  easy,  software  packages 
such as the one developed by the Alpha_1 group at the University of Utah
offer  user-friendly  tools  which  should  allow  for  the  practical  development  of 
detailed  target  models.  It  is  expected  that  the  creation  of  rapid  and  robust 
algorithms  for  the  simulation  of  radar  backscatter  from  complex  B-spline 
surfaces  will  result  in  a  significant  demand  for  the  development  of  a  library 
of  B-spline  target  models  of  interest. 

The second basic problem is the local gradient search technique currently
used to find twinkles and initial specular points. The algorithms
used to demonstrate the method simply start searching for twinkles in the
middle of each patch and the middle of the trajectory. The algorithms can
find only one twinkle on a single spline patch (there may be more) and
are not even assured of finding a twinkle when one exists. These problems
are typical of local gradient search techniques when no additional
information is used to determine the areas to be searched. Fortunately, there
are properties of B-spline or B-spline-like surfaces which can be exploited
to help assure global convergence of the algorithms. In particular, as
B-spline patches are recursively subdivided [4], simple geometric tests based
on bounds for both principal radii of curvature of the patches can be used
to determine if it is possible for a twinkle (or initial specular point) to
exist on that patch. With the use of a suitable stopping point, the patches
could be subdivided (the majority being thrown out at each step) until only
arbitrarily small patches exist which may contain twinkles (or initial
specular points). Such algorithms should prove to be both rapid and robust.
In addition, the method should extend readily to the multiple-bounce case,
although it is not intuitively obvious what the more generalized geometric
tests should be at this time.

ABSTRACT - A method of processing data acquired from a scanning
spectrometer is described. The method employs an algorithm designed
to find the spectral data from a continuous stream of data
from the spectrometer and to provide the experimenter with the
relative peak intensities and relative powers of the known spectral
lines. This algorithm does not require the use of outside reference
sources, such as an electronic pulse synchronized with the
output of the spectral data, to find each frame of spectral data.
This can save thirty-three percent of the memory storage otherwise
required for all the data from the spectrometer, and approximately
twenty-five percent of the processing time on the computer.

I. INTRODUCTION. The High Energy Laser Systems Test Facility
(HELSTF) is a research center for testing effects on various materials
utilizing a high-powered laser. The Army provides support to
users in the form of secure and safe areas for performing experiments,
and data acquisition and computer systems for collecting and
processing the data. The author, as the data analyst, ensures that
the test data is acquired successfully and processed satisfactorily
before it is given to the user. The author presents in this paper
a description of one of the data collecting instruments used at
HELSTF, a scanning spectrometer, and the software developed to process
the data from the scanning spectrometer. The algorithm is not
original, but the author is not aware of its use elsewhere in this
particular application.

Figure 1 shows a simplified diagram of the optical setup of
the scanning spectrometer. This configuration is called a double-path
Czerny-Turner spectrometer. Light from the source enters
through the Cassegrain subsystem at the lower left of the diagram.
Optimum efficiency of the spectrometer occurs when the Cassegrain
optics focuses the light on the slit such that the light fully
illuminates the diffraction grating. The light proceeds from the
slit to the lower spherical mirror and is reflected to the reflecting
diffraction grating. The upper spherical mirror then collimates
the dispersed beam and directs it to the flat mirror, which
reflects it to the scanning corner reflector. The
scanning subassembly consists of 24 corner reflectors attached to
a rotating disk with multiple rotating speeds available. The
speed chosen here is such that the spectrum is scanned 800 times
each second. As the corner mirror scans the spectrum, the light
is reflected back through the Czerny-Turner optics, where it travels
back over the original paths until it is diverted to the two
indium antimonide (InSb) detectors. These detectors are cooled
to 77 K with liquid nitrogen, and they operate in the
photoconductive mode.

Figure 2 shows a simplified functional diagram of the photometric
process. The detector sees a photon with frequency f and
its conductivity is modified, which produces a certain voltage
level at the output of the circuit. The voltage level is measured
with an electronic meter and recorded. The responsivity of the
detector defines how much power corresponds to the measured voltage.
Many factors determine the responsivity of a detector, such as
the chemical composition, the environmental temperature, and so on;
more detail can be found in any book on detectors. The box
furthest to the right represents the process of calculating the
absolute power from the responsivity values and the signal values.


The outputs from the detectors are fed into amplifying
electronics and output to BNC jacks. The two outputs cover the
spectral region from 3.6 microns to 4.05 microns. The short
wavelength region covers 3.60 to 3.85 microns and the long
wavelength region covers 3.80 to 4.05 microns. Prior to each scan
of the spectrum an electronic pulse, called the sync pulse, is
generated with a duration of 0.250 milliseconds. This leaves
1.0 ms for each frame of spectral data (at 800 scans per second,
each scan period is 1.25 ms). The sync pulse also allows the scan
time to be related to the wavelength of the spectrum. Plots of
the signals are shown in Figure 3. The uppermost plot is all three
signals multiplexed. The lower three plots are the three individual
signals. The process of multiplexing data signals can be found in
most books on digital communications. This data is from the
spectrum of a chemical laser using deuterium and fluorine. The
laser device is operated by TRW.


During the test, data from the spectrometer is FM recorded
locally on three channels of the analog tape. After the test,
the FM tape is played back and the data digitized and multiplexed
onto another tape. When one wishes to process the spectral
data, it is transferred to disk and demultiplexed.


Originally the author's task was to take the demultiplexed
data and develop software to determine the spectral line intensities
and calculate the relative powers within the spectral
lines. It was assumed that the first spectral line always occurred
within a determined time interval and that the distances,
in time or wavelength, between all the spectral lines remained
constant. The software was designed to use as a reference point
a specified level value of the leading edge of the sync pulse.
Therefore, the program would read the sync pulse data and, upon
finding the reference point, it would know that it had to read
so many points of the short and long wavelength data files before
reaching the first spectral lines in each file. The program
knew that the second lines were a certain number of data
points from the first, the third from the second, and so on.
These distances were different in the short and long data files.
The program was run using test data and its output seemed
satisfactory; therefore, the task was considered successfully
completed until a new requirement was generated by one of the
users.

One user was interested in seeing how each spectral line's
intensity varied during the course of the test. I merely had
to modify the program so that as the light intensities were
determined they were written to a file: one file for each
wavelength. The same was done for the energy in each line. The
first time the modified program was run, it produced outputs
which resembled the plots of the spectra. A careful analysis
revealed that the spectra were shifting relative to the sync pulses.
However, the spacing between the spectral lines remained constant
from one frame of data to the next. The cause of this
shift was not immediately apparent, although it was thought
that it was not in the software I had been developing. The
cause of the shifting of the spectrum was found to be in the
parameter values used in the demultiplexing algorithm. The
error in these values caused a number of data points to be
skipped over, causing the spectrum data to be shifted toward
the sync pulse data.

It was decided at this point not to use the sync pulse
data and to find an algorithm which would identify the spectra
using the characteristics of the spectrum itself. The
only characteristic chosen was the spacing of the spectral
lines, which one could consider as a pattern that occurred
periodically many times in a long stream of time-series data.
Here was a problem which was solvable using pattern recognition
techniques.

II.  DISCUSSION.  The  technique  employed  is  based  on  the 
convolution  integral.  Here  the  integral  is  expressed  as  a 
summation  because  the  data  consists  of  discrete  points. 

The expression is written as:

$$
C(t) = \sum_{i=1}^{N} f(i)\, s(i - t),
$$

where C(t) is the correlation at the t-th data point, f(i) is
the i-th point of the template or filter, s(i) is the
i-th point of the real spectrum, and N is the number of points
in the template. As t increases to the end
of the data, C(t) will vary and fluctuate from relative
minima to relative maxima (see Figure 6). The number of
data points in one frame of data is approximately 626 ± 4.
There will be one relative maximum for every 626 ± 4 consecutive
values of C(t), and at this value of t, a spectrum
begins. A complete explanation of correlation functions
can be found in any textbook on digital communications
theory.
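
The frame search can be sketched in Python as follows. This is an illustration of the idea, not the FORTRAN program described here: the line positions, line shape, noise level, and synthetic data stream are invented, and the sum is evaluated with `numpy.correlate`, whose index convention is s(i + t) and may differ in sign from the expression above. Only the frame length of 626 points and its tolerance of 4 points are taken from the text.

```python
import numpy as np

FRAME_LEN, SLACK = 626, 4          # frame length and tolerance from the text
LINE_POS = [10, 60, 95, 150]       # hypothetical line positions within a frame

def make_pattern(length, width=2.0):
    """Gaussian-shaped lines at LINE_POS (an illustrative spectrum, not real data)."""
    x = np.arange(length)
    return sum(np.exp(-0.5 * ((x - p) / width) ** 2) for p in LINE_POS)

def locate_frames(signal, template):
    """Return the start index of each frame and the correlation sequence C(t)."""
    c = np.correlate(signal, template, mode="valid")   # C(t) = sum_i f(i) s(i + t)
    t = int(np.argmax(c[:FRAME_LEN]))   # first frame: best match in one frame length
    starts = [t]
    while t + FRAME_LEN - SLACK < len(c):
        lo = t + FRAME_LEN - SLACK      # next start is expected 626 +/- 4 points on
        t = lo + int(np.argmax(c[lo:lo + 2 * SLACK + 1]))
        starts.append(t)
    return starts, c

# Synthetic stream: five frames whose lengths jitter by a few points.
true_starts = np.cumsum([37, 629, 624, 630, 622])
signal = np.zeros(true_starts[-1] + FRAME_LEN)
for s0 in true_starts:
    signal[s0:s0 + 200] += make_pattern(200)
signal += np.random.default_rng(0).normal(0.0, 0.02, signal.size)

found, corr = locate_frames(signal, make_pattern(200))
```

Restricting each search to a window of 626 ± 4 points past the previous start keeps the cost low and avoids weaker partial matches elsewhere in the frame. A stream that ends partway through a frame would need an additional threshold on C(t) to reject a spurious final detection.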

The  software  does  three  major  functions:  reads  the 
data,  scans  the  data  determining  locations  of  the  frames  of 
spectral  data  while  it  finds  the  peak  intensity  of  each  line 
and  the  relative  energy  in  each  line,  and  finally  goes  back 
and  calculates  frame  averages  of  the  spectra  according  to  the 
options  given  to  the  user. 

FIGURE 6. PLOT OF CORRELATION COEFFICIENTS, C(I), VERSUS THE DATA
POINTS. THIS IS FROM A PARTIAL SCAN OF THE SHORT WAVELENGTH DATA.

The user starts the program and enters values which allow
the program to determine how much of the data is to be
processed, whether or not averages are to be calculated, and
how many are to be calculated. The data is then read and
stored in memory; one-third less data needs to be read and
stored because the sync pulse data is no longer necessary.
The spectral data is scanned and the locations
of the spectra are determined and stored in memory (see
Figures 4, 5, 7, and 8). As the scanning proceeds, each
value of the spectral line peak intensity is stored in a
separate file, one file for each wavelength where a spectral
line exists. The same is done with the energies within each
line.


There are two detectors used and, therefore, values are
different even for the same wavelength signal input. Therefore,
the ratio of the energy contained within a line seen
by both detectors is taken and the data from one detector
is corrected for the differences between the detectors. The
short and long spectral data are concatenated at the overlapping
regions to show the entire spectrum, as shown in Figure
9.
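
A minimal sketch of this correction and concatenation is given below. It assumes the overlap is the 3.80 to 3.85 micron band shared by the two channels, that the long wavelength channel is the one rescaled, and that the overlap is taken from the short channel when joining; these choices, the function names, and the synthetic line are illustrative assumptions.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y over abscissae x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def match_and_join(wl_short, y_short, wl_long, y_long, overlap=(3.80, 3.85)):
    """Scale the long-wavelength channel to the short one and join the two."""
    lo, hi = overlap
    in_short = (wl_short >= lo) & (wl_short <= hi)
    in_long = (wl_long >= lo) & (wl_long <= hi)
    # Ratio of the energy of the same line as seen by the two detectors.
    ratio = (trapezoid(y_short[in_short], wl_short[in_short])
             / trapezoid(y_long[in_long], wl_long[in_long]))
    keep = wl_long > hi                  # take the overlap from the short channel
    wl = np.concatenate([wl_short, wl_long[keep]])
    y = np.concatenate([y_short, ratio * y_long[keep]])
    return wl, y, ratio

def line(wl):
    # Hypothetical spectral line centered in the overlap region.
    return np.exp(-0.5 * ((wl - 3.825) / 0.005) ** 2)

# Synthetic check: the long detector is taken to be 20 percent less sensitive.
wl_s = np.linspace(3.60, 3.85, 251)
wl_l = np.linspace(3.80, 4.05, 251)
wl, y, ratio = match_and_join(wl_s, line(wl_s), wl_l, 0.8 * line(wl_l))
```

In this synthetic case the recovered ratio is close to 1.25, the factor needed to bring the long channel up to the short one.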


Finally, the program calculates averages if they were
requested. If not, the program is finished. If averages are
requested, then the requested number of sets of frames is
averaged. If more than one set of frames is averaged, the
first set of N frames is averaged, and the next set is made up
by averaging N frames starting with the second frame and
including the (N+1)th frame. This continues until the program
has executed all the instructions based on the values the user
entered at the start of the program.
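The overlapping averaging scheme described above can be sketched in a few lines. This is only an illustration in Python, not the author's FORTRAN 77 code; the function name `sliding_frame_averages` and the representation of a frame as a list of intensities are assumptions made for the example.

```python
def sliding_frame_averages(frames, n, num_sets):
    """Average num_sets overlapping windows of n consecutive frames.

    Window 0 covers frames 0..n-1, window 1 covers frames 1..n, and so on,
    matching the "first N frames, then frames 2 through N+1" description.
    Each frame is a list of intensities of equal length (an assumption).
    """
    if n < 1 or num_sets < 0 or n + num_sets - 1 > len(frames):
        raise ValueError("not enough frames for the requested averages")
    averages = []
    for start in range(num_sets):
        window = frames[start:start + n]
        # Average point by point across the n frames in this window.
        averages.append([sum(column) / n for column in zip(*window)])
    return averages


# Three tiny two-point "frames", averaged two at a time.
example = sliding_frame_averages([[1, 2], [3, 4], [5, 6]], n=2, num_sets=2)
print(example)
```

Because successive windows share N-1 frames, a running sum could replace the inner average for long records; the direct form is kept here for clarity.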


III. CONCLUSION. The degree to which the software succeeds
in recognizing the spectrum can be seen in Figure 9. The top
plot shows a single frame of data. The bottom plot shows the
average of 200 consecutive frames. The width of the lines in
the averaged frame is the same as that in the nonaveraged
frame. This would not be the result if the recognition of the
spectrum by the artificial spectrum (template or filter) were
off by even one data point in position. This is the purpose of
the algorithm, because once the spectrum is located, the
calculations are straightforward. The relative energies in the
spectral lines were calculated using the trapezoidal rule. I
developed the software in FORTRAN 77 on a VAX 11/780 computer
system.
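For readers unfamiliar with the trapezoidal rule mentioned above, a minimal sketch follows. It is written in Python rather than the original FORTRAN 77, and it assumes uniformly spaced samples and that "relative energy" means each line's integral divided by the sum over all lines; neither assumption is confirmed by the text.

```python
def trapezoid(values, spacing=1.0):
    """Trapezoidal-rule integral of uniformly spaced samples."""
    if len(values) < 2:
        return 0.0
    # Interior points count fully, the two end points count half.
    return spacing * (sum(values) - 0.5 * (values[0] + values[-1]))


def relative_line_energies(lines, spacing=1.0):
    """Integrate each spectral line and normalize by the total (assumed definition)."""
    energies = [trapezoid(line, spacing) for line in lines]
    total = sum(energies)
    if total == 0.0:
        raise ValueError("total energy is zero; cannot normalize")
    return [energy / total for energy in energies]


# Two hypothetical lines sampled across their width.
print(relative_line_energies([[0, 2, 0], [0, 1, 2, 1, 0]]))
```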

THE KP EQUATION - A COMPARISON TO LABORATORY
GENERATED BIPERIODIC WAVES*

Norman  W.  Scheffner 

U.S.  Army  Engineer  Waterways  Experiment  Station 
Coastal  Engineering  Research  Center 
Vicksburg,  MS  39180 


ABSTRACT. The propagation of waves in shallow water is a phenomenon of
significant practical importance. The ability to realistically predict the
complex wave characteristics occurring in shallow water regions has always been
an engineering goal which would make the development of solutions to practical
engineering problems a reality. The difficulty in making such predictions
stems from the fact that the equations governing the complex three-dimensional
flow regime cannot be solved without linearizing the problem. The linear
equations are solvable; however, their solutions do not reflect the nonlinear
features of naturally occurring waves. A recent advance (1984) in nonlinear
mathematics has resulted in an explicit solution to a nonlinear equation
relevant to water waves in shallow water. The solution possesses features
found in observed nonlinear three-dimensional wave fields.

The  nonlinear  mathematical  formulation  referred  to  above  has  never  been 
compared  with  actual  waves,  so  that  its  practical  value  is  unknown.  The 
purpose  of  the  present  investigation  was  to  physically  generate  three- 
dimensional  waves  and  compare  these  with  exact  mathematical  solutions.  The 
goals  were  successfully  completed  by  first  generating  the  necessary  wave 
patterns  with  the  new  U.S.  Army  Engineer  Waterways  Experiment  Station,  Coastal 
Engineering  Research  Center's  (CERC)  directional  spectral  wave  generation 
facility.  The  theoretical  solutions  were  then  formed  through  the  determination 
of  a  unique  correspondence  between  the  free  parameters  of  the  solution  and  the 
physical  characteristics  of  the  generated  wave. 

I.  INTRODUCTION.  One  of  the  first  mathematical  models  of  nonlinear 
waves  in  shallow  water  with  known  solutions  was  presented  by  Korteweg  and 
deVries  in  their  famous  1895  paper.  Their  formulation,  known  as  the  KdV 
equation,  can  be  written  in  the  following  nondimensional  form 

$$ f_t + 6 f f_x + f_{xxx} = 0 \tag{1} $$

in  which  f  represents  the  water  surface  displacement,  x  is  the  direction 
of  propagation,  and  t  is  time.  This  equation  admits  not  only  solitary  wave 
solutions  but  also  the  periodic  solutions  commonly  known  as  cnoidal  waves. 


* Presented at the 20th International Conference on Coastal Engineering,
November 9-14, 1986, Taipei, Taiwan, and included in the proceedings thereof,
entitled "Biperiodic Waves in Shallow Water."


These  solutions  can  be  written  as 


$$ f(x,t) = 2\sigma^2 k^2\,\mathrm{cn}^2(\theta; k) - 2\sigma^2\left[\frac{E}{K} - 1 + k^2\right] \tag{2} $$

where each of the terms in the solution is a well documented analytic function
which can easily be computed in terms of known wave characteristics such as
wave height and wavelength. Unfortunately, cnoidal wave solutions are valid
only for long crested waves, i.e., waves which can be described by a single
time-dependent one-dimensional surface wave pattern. Natural waves, in
contrast, are composed of both long and short crested waves and cannot be
adequately described by this theory.
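As a numerical sanity check on the KdV equation in the nondimensional form used here, $f_t + 6 f f_x + f_{xxx} = 0$, the sketch below evaluates the residual for the solitary wave $f = 2\sigma^2\,\mathrm{sech}^2[\sigma(x - 4\sigma^2 t)]$, which is the limit of the cnoidal wave as the modulus k approaches 1. The function names, the value of $\sigma$, and the finite-difference step are illustrative choices, not taken from the paper.

```python
import math


def solitary_wave(x, t, sigma):
    """KdV solitary wave f = 2 sigma^2 sech^2(sigma (x - 4 sigma^2 t))."""
    arg = sigma * (x - 4.0 * sigma ** 2 * t)
    return 2.0 * sigma ** 2 / math.cosh(arg) ** 2


def kdv_residual(f, x, t, h=1e-2):
    """Central-difference estimate of f_t + 6 f f_x + f_xxx at (x, t)."""
    f_t = (f(x, t + h) - f(x, t - h)) / (2.0 * h)
    f_x = (f(x + h, t) - f(x - h, t)) / (2.0 * h)
    f_xxx = (f(x + 2 * h, t) - 2.0 * f(x + h, t)
             + 2.0 * f(x - h, t) - f(x - 2 * h, t)) / (2.0 * h ** 3)
    return f_t + 6.0 * f(x, t) * f_x + f_xxx


sigma = 0.5  # illustrative value only


def wave(x, t):
    return solitary_wave(x, t, sigma)


# Largest residual over a row of sample points at one instant.
worst = max(abs(kdv_residual(wave, 0.25 * i, 0.3)) for i in range(-20, 21))
print("largest |residual| on the sample grid:", worst)
```

The residual is not exactly zero because the derivatives are approximated; it shrinks as the step h is reduced, until round-off takes over.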

A recent advance in nonlinear mathematics has been reported by Segur and
Finkel (1984). They present explicit analytical solutions to a natural three-
dimensional extension of the KdV equation proposed by Kadomtsev and
Petviashvili (1970), known as the KP equation shown below

$$ \left(f_t + 6 f f_x + f_{xxx}\right)_x + 3 f_{yy} = 0 \tag{3} $$

where x now represents the primary direction of propagation; however, weak
changes in the y-direction are now permitted. When no y-variations occur, the
KP equation reverts to the KdV equation.
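The reduction to the KdV equation can be written out explicitly. The short derivation below is a sketch: it assumes that the function of time g(t) produced by the x-integration can be set to zero (for example, because the solution and its derivatives vanish somewhere, or because g can be absorbed by a change of variables); that normalization is an assumption and is not stated in the text.

```latex
\begin{aligned}
f_{yy} = 0 \;&\Rightarrow\; \left(f_t + 6 f f_x + f_{xxx}\right)_x = 0 \\
&\Rightarrow\; f_t + 6 f f_x + f_{xxx} = g(t) \\
&\Rightarrow\; f_t + 6 f f_x + f_{xxx} = 0 \quad \text{when } g(t) \equiv 0 .
\end{aligned}
```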

The KP equation admits an infinite-dimensional family of exact,
periodic solutions (see Dubrovin, 1981 and Segur and Finkel, 1984) which can
be written in the form

$$ f(x,y,t) = 2\,\frac{\partial^2}{\partial x^2}\ln\theta \tag{4} $$

where $\theta$ is a Riemann theta function of genus n. Genus 1 solutions are
exactly equivalent to cnoidal waves; they are permanent form, singly periodic,
two-dimensional (one vertical and one horizontal) nonlinear waves. Genus 2
waves are biperiodic in that they permit the independent specification of two
periodicities in both time and space. The solutions are genuinely three-
dimensional, nonlinear, and propagate with permanent form at a constant
velocity. Genus 3 and higher order solutions are multi-periodic and cannot
be characterized as permanent form with respect to any translating coordinate
system as the genus 1 and 2 solutions can. The present investigation is
limited to the genus 2 solutions developed by Segur and Finkel.

The construction of a genus 2 solution of the KP equation is based on the
specification of the appropriate Riemann theta function. This requires the
introduction of a two-component phase variable and a 2 x 2 real-valued
Riemann matrix. The first of these, the phase variable, is shown below.

$$ \phi_1 = \mu_1 x + \nu_1 y + \omega_1 t + \phi_{10} $$

and

$$ \phi_2 = \mu_2 x + \nu_2 y + \omega_2 t + \phi_{20} \tag{5} $$

where the parameters $\mu_1$, $\mu_2$, $\nu_1$, and $\nu_2$ are wave numbers,
$\omega_1$ and $\omega_2$ are angular frequencies, and $\phi_{10}$ and
$\phi_{20}$ are constants with no dynamical significance. The second ingredient
involves the specification of a real-valued, negative definite, symmetric
2 x 2 Riemann matrix as shown below.

$$ B = \begin{pmatrix} b & b\lambda \\ b\lambda & b\lambda^2 + d \end{pmatrix} \tag{6} $$

The parameters b, d, and $\lambda$ represent solution nonlinearity. The genus 2
theta function can now be defined in terms of the above components by the
following double Fourier series:


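$$ \theta(\phi_1, \phi_2) = \sum_{m_1=-\infty}^{\infty}\ \sum_{m_2=-\infty}^{\infty} \exp\left(\tfrac{1}{2}\,\mathbf{m}\cdot B\cdot\mathbf{m} + i\,\mathbf{m}\cdot\boldsymbol{\phi}\right) \tag{7} $$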
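The double Fourier series converges rapidly when the Riemann matrix is negative definite, so a truncated sum is easy to evaluate. The sketch below is an illustration only: it takes m = (m1, m2) and phi = (phi1, phi2), uses the Riemann matrix with entries b, b*lambda, b*lambda, b*lambda^2 + d, and sums the real part (the imaginary parts cancel between m and -m). The function names, the truncation limit `m_max`, and all numerical parameter values are assumptions chosen for the example, and the code does not enforce the additional relations that make the result a KP solution.

```python
import math


def theta_genus2(phi1, phi2, b, lam, d, m_max=10):
    """Truncated genus 2 theta series for the matrix [[b, b*lam], [b*lam, b*lam**2 + d]]."""
    if not (b < 0.0 and d < 0.0):
        raise ValueError("b and d must both be negative for a negative definite matrix")
    total = 0.0
    for m1 in range(-m_max, m_max + 1):
        for m2 in range(-m_max, m_max + 1):
            # m.B.m = b*m1^2 + 2*b*lam*m1*m2 + (b*lam^2 + d)*m2^2
            quad = b * (m1 + lam * m2) ** 2 + d * m2 ** 2
            total += math.exp(0.5 * quad) * math.cos(m1 * phi1 + m2 * phi2)
    return total


def elevation(x, y, t, wave1, wave2, matrix, h=1e-3):
    """f = 2 d^2/dx^2 ln(theta), with the x-derivative taken by central differences."""
    mu1, nu1, omega1, phi10 = wave1
    mu2, nu2, omega2, phi20 = wave2
    b, lam, d = matrix

    def log_theta(xx):
        phi1 = mu1 * xx + nu1 * y + omega1 * t + phi10
        phi2 = mu2 * xx + nu2 * y + omega2 * t + phi20
        return math.log(theta_genus2(phi1, phi2, b, lam, d))

    return 2.0 * (log_theta(x + h) - 2.0 * log_theta(x) + log_theta(x - h)) / h ** 2


# Hypothetical symmetric parameters (mu1 = mu2, nu1 = -nu2, omega1 = omega2).
matrix = (-2.0, 0.3, -1.5)
wave1 = (1.0, 0.5, -1.0, 0.0)
wave2 = (1.0, -0.5, -1.0, 0.0)
print(theta_genus2(0.3, 1.1, *matrix))
print(elevation(0.0, 0.0, 0.0, wave1, wave2, matrix))
```

The theta function is 2*pi periodic in each phase, which is what produces the repeating surface pattern when the phases are linear in x, y, and t.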


The calculation of a general case genus 2 KP solution requires the
specification of the 11 parameters shown in Equations 5 and 6. Two of these
parameters ($\phi_{10}$ and $\phi_{20}$) have no dynamical significance; their
only effect is to shift the origin of the resulting solution. Dubrovin (1981)
proved that a genus 2 theta function in the form of Equation 7 is a solution
to the KP equation if, and only if, the solution parameters are related by four
additional equations. One of these equations contains a constant of
integration. Use of these additional criteria reduces the number of free
parameters to 8, representing the minimum number of free parameters required
to specify a general case genus 2 solution.

Genus  2  solutions  of  the  KP  equation  describe  a  complex  two-dimensional 
surface  wave  pattern.  Similar  features  were  observed  by  Hammack  (1980)  to 
result  from  the  nonlinear  interaction  of  two  intersecting  waves.  The 
theoretical  development  by  Segur  and  Finkel  was  partially  prompted,  in  fact, 
by  these  reported  waves.  The  development  of  an  experimental  program  which 
would  result  in  the  generation  of  surface  wave  patterns  qualitatively  similar 
to  genus  2  solutions  was  achieved  by  attempting  to  experimentally  reproduce 
the  conditions  reported  by  Hammack,  i.e.,  intersecting  waves.  This  generation 
technique  can  best  be  described  by  presenting  the  analogy  of  interacting 
waves.  Consider,  for  example,  two  periodic  waves  which  intersect  and  pass 
through each other as shown in Figure 1. The angles $\alpha_1$ and $\alpha_2$
represent the angle of the crest of each wave front with respect to some
reference line. The resulting surface wave pattern, according to linear wave
theory, would simply be a superposition of the two individual waves. This
would produce a diamond shaped surface pattern as indicated in Figure 1. It
can be seen that certain of the basic characteristics of the individual waves,
wavelength and angle of propagation for example, have been preserved.

Figure 1. The Linear Intersection of Waves

Figure 2. The Nonlinear Intersection of Waves


Now, consider the analogous case in which similarly intersecting waves
interact nonlinearly with each other. This scenario is shown schematically
in Figure 2. The resulting wave pattern shows that a "stem of interaction" is
formed at the point where the two waves cross each other. The formation of
this stem region is a result of a phase shift in the crest line angles of the
original waves. This phenomenon is shown in Figure 2 superimposed on the
corresponding linear wave solution. The resulting surface wave pattern now
assumes a hexagonal pattern in which a third wave crest, separate from the
original two, is formed. This phase shift and stem formation are indicative
of the nonlinear interaction of the two waves since the exact linear solution
does not predict either the phase shift or the new wave crest. Genus 2
solutions of the KP equation predict these features and were tested as a
possible model for their description.

II. LABORATORY FACILITIES AND EXPERIMENTAL PROCEDURES. A project was
initiated at CERC to generate three-dimensional nonlinear wave fields in the
laboratory and then apply KP theory to the resulting waves in order to
determine whether or not the KP equation is a model for these waves and, if
so, over what range it is applicable. This required the use of the CERC
directional spectral wave generation facility. This unique wave generator,
shown in Figure 3, was designed and constructed for CERC by MTS Systems
Corporation of Minneapolis, Minnesota, based on design specifications provided
by CERC. The generator is comprised of 60 individually programmable
electromechanical wave paddles. Each wave paddle is 1.5 ft wide, making the
generator a total of 90.0 ft wide. The generator is located in a 98.0 by
184.0 ft wave basin with 2.5 ft high side walls. Computer control of the
system is provided by a Digital Equipment Corporation (DEC) VAX 11/750
central processing unit. The above facilities were utilized to generate genus
2 candidate waves in a comprehensive experimental program.


Figure 3. The Directional Spectral Wave Generator

The wave generator was programmed to simultaneously generate intersecting
cnoidal  wave  trains.  A  variety  of  wave  fields  were  generated  by  varying  both 
the  wavelength  of  the  individual  waves  and  their  angle  of  intersection.  Twelve 
wave  fields,  generated  in  this  manner,  were  used  to  test  the  KP  equation.  The 
wave  fields  selected  for  the  experimental  program  are  presented  in  Table  1. 
Waves  characterized  by  three  wavelengths  (7,  11,  and  15  ft)  were  combined  with 
phase  shifts  between  adjacent  wavemaker  paddles.  These  phase  shifts  were 
approximately  equivalent  to  the  angle  of  the  wavecrest  with  respect  to  the 
axis  of  the  wave  generator.  The  angle  in  the  table  shows  the  approximate 
correspondence  between  the  phase  lag  and  the  angle  of  propagation. 

Table 1

The experimental waves

Test Number   Wavelength (ft)   Phase Shift (deg)   Angle (deg)   Period (sec)

CN1007         7.0              10.0                 7.45         1.378
CN1507         7.0              15.0                11.21         1.378
CN2007         7.0              20.0                15.03         1.378
CN3007         7.0              30.0                22.89         1.378
CN4007         7.0              40.0                31.23         1.378
CN1011        11.0              10.0                11.75         1.947
CN1511        11.0              15.0                17.79         1.947
CN2011        11.0              20.0                24.04         1.947
CN3011        11.0              30.0                37.67         1.947
CN1015        15.0              10.0                16.12         2.553
CN1515        15.0              15.0                24.62         2.553
CN2015        15.0              20.0                33.75         2.553
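The correspondence between paddle phase shift and crest angle in Table 1 can be checked numerically. The text does not state the conversion formula; the sketch below assumes the standard "snake" wavemaker relation sin(angle) = (phase shift / 360) x (wavelength / paddle width), using the 1.5 ft paddle width given earlier. Under that assumption the computed angles agree with the tabulated ones to within about 0.01 degree. The function and variable names are illustrative.

```python
import math

PADDLE_WIDTH_FT = 1.5  # paddle width stated in the facility description


def crest_angle_deg(wavelength_ft, phase_shift_deg, paddle_width_ft=PADDLE_WIDTH_FT):
    """Crest angle implied by a phase lag between adjacent paddles (assumed relation)."""
    sine = (phase_shift_deg / 360.0) * wavelength_ft / paddle_width_ft
    if abs(sine) > 1.0:
        raise ValueError("phase shift too large for this wavelength")
    return math.degrees(math.asin(sine))


# (test number, wavelength in ft, phase shift in deg, tabulated angle in deg)
TABLE_1 = [
    ("CN1007", 7.0, 10.0, 7.45), ("CN1507", 7.0, 15.0, 11.21),
    ("CN2007", 7.0, 20.0, 15.03), ("CN3007", 7.0, 30.0, 22.89),
    ("CN4007", 7.0, 40.0, 31.23), ("CN1011", 11.0, 10.0, 11.75),
    ("CN1511", 11.0, 15.0, 17.79), ("CN2011", 11.0, 20.0, 24.04),
    ("CN3011", 11.0, 30.0, 37.67), ("CN1015", 15.0, 10.0, 16.12),
    ("CN1515", 15.0, 15.0, 24.62), ("CN2015", 15.0, 20.0, 33.75),
]

for name, wavelength, phase, tabulated in TABLE_1:
    computed = crest_angle_deg(wavelength, phase)
    print(f"{name}: computed {computed:6.2f} deg, tabulated {tabulated:6.2f} deg")
```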

Genus 2 solutions can be visualized as a series of repeating two-
dimensional permanent form surface patterns, referred to as period
parallelograms. These patterns translate at a constant velocity in a constant
direction. The global wave field is represented by a tiling of these basic
patterns; therefore, the entire wave pattern can be exactly specified by
quantifying just one period parallelogram. The location of a basic
parallelogram within the hexagonal wave field of Figure 2 is shown in
Figure 4. The phase variables of Equation 5 define the horizontal limits of
these patterns such that each side is uniquely defined by $\phi_1$ = constant
and $\phi_2$ = constant. The components of the Riemann matrix define the
vertical and horizontal distribution within the period parallelogram.

Figure 4. The Period Parallelogram

Detailed  measurements  of  each  of  the  generated  wave  fields  shown  in 
Table  1  were  required  in  order  to  relate  the  physical  characteristics  of  the 
waves  to  the  parameters  of  the  corresponding  period  parallelogram  of  the  exact 
solution.  This  quantification  was  accomplished  by  first  using  overhead 
photography  to  determine  the  dimensions  of  the  period  parallelogram  and  to 
provide  an  estimate  of  the  internal  features,  such  as  the  phase  shift  and  stem 
length. Knowledge of these horizontal features and their location within the
wave tank was then used to locate a linear array of 9 recording wave gages in
the  wave  basin.  This  approach  provided  a  vertical  wave  record  which  could  be 
identified  with  a  known  location  within  the  parallelogram. 

III. COMPARING THEORETICAL SOLUTIONS TO OBSERVED WAVES. The experimental
program described above generates symmetric cnoidal waves ($\alpha_1 = \alpha_2$
in Figure 1), resulting in a symmetric period parallelogram. This simplification
was adopted so that the generated wave patterns would all propagate
perpendicularly off the axis of the wave generator, making it possible to
measure all wave forms with a single stationary wave gage array. Symmetry also
reduces the number of free parameters which need to be specified; for example,
$\mu_1 = \mu_2$, $\nu_1 = -\nu_2$, and $\omega_1 = \omega_2$ from Equation 5.
This simplification results in the requirement of only three dynamical
parameters and two nondynamical parameters. The parameters chosen were b,
$\mu$, and $\lambda$ along with the phase shift parameters $\phi_{10}$ and
$\phi_{20}$. The following sequence of events was used for optimizing these
coefficients. Experiment CN3007 will be used to demonstrate the verification
process.

Each of the waves of Table 1 was generated in the wave basin. Two
overlapping photographs were taken with dual Hasselblad model 500EL/M 70mm
cameras equipped with 50mm lenses mounted 23 ft above the floor of the basin.
The resulting mosaic photograph, shown in Figure 5, was used to estimate the
length and width of the period parallelogram. This resulted in estimates for
$\mu_1 = \mu_2$ and $\nu_1 = -\nu_2$. An estimate for the phase shift parameter
$\lambda$ was also determined from the photograph. The accuracy of $\mu$,
$\nu$, and $\lambda$ is a function of the distortions in the photograph.
Because of this distortion, their values were considered to be initial
estimates. Following the photographing of all waves, a gage array with a
spacing of 2.5 ft, located 40.0 ft from and parallel to the generator, was
selected for use in all tests. The location of each of the gages with respect
to wave CN3007 is shown in Figure 5. It can be seen that each gage can be
uniquely referenced according to a distance from the center of the
parallelogram. Since all parallelograms are identical, wave gages located in
an adjacent parallelogram can be referenced to the common center point.

Figure 5. Overhead Mosaic Photograph of Test Wave CN3007

Wave gages were located in the basin and each of the waves of Table 1
was regenerated. Data were sampled for each of the gages at a rate of 50
samples  per  second  for  a  total  of  30.0  seconds.  Figure  6  shows  the  wave 
traces  for  CN3007.  The  correspondence  between  the  wave  traces  and  their 
location  within  the  parallelogram  can  easily  be  seen.  For  example,  gage  5  is 
located  on  a  stem  where  only  one  peak  per  passing  of  the  parallelogram  is 
experienced.  Gage  3  is  located  in  the  saddle  region  where  two  smaller  peaks 
per  period  are  seen.  This  comparison  demonstrates  the  usefulness  of  the 
photographs  in  interpreting  the  data  since  three-dimensional  effects  are 
difficult  to  deduce  from  two-dimensional  data. 

The  determination  of  the  free  coefficients  can  now  be  made.  Known  or 
estimated  data  are  the  period  of  the  wave  (determined  from  the  recording  wave 
gages),  the  length  and  width  of  the  period  parallelogram  and  an  estimate  of 
the  phase  shift  parameter  X  determined  from  the  photographs,  and  a  maximum 
wave  height  selected  from  the  wave  gage  data.  The  following  iteration 
procedure  was  used  to  optimize  the  coefficients: 


a. The estimated values for $\mu_1 = \mu_2$, $\nu_1 = -\nu_2$, and $\lambda$
were specified. The nondynamical parameters $\phi_{10}$ and $\phi_{20}$ were
accounted for by specifying solutions to be computed at locations within the
period parallelogram corresponding to the locations of the wave gages. A value
of b was then selected such that the dimensionalized maximum KP solution was
within 5.0 percent of the measured value.

b. The value of $\mu_1 = \mu_2$ was adjusted, if necessary, until the
dimensionalized period was within 3.0 percent of the measured period.

c. The value of $\nu_1 = -\nu_2$ was adjusted, if necessary, until the
dimensionalized value of the y-direction wavelength was within 10.0 percent of
the estimated value. A 10-percent criterion was used for this iteration since
the length of the parallelogram was difficult to determine from the
photographs.

d. Because of the nonlinear coupling of the solution coefficients, each
adjustment affected all parameters to some extent. If corrections were found
to be necessary, steps (a) through (c) were repeated until all of the
specified tolerances were met or exceeded. Possible phasing problems
regarding the gage locations within the parallelogram were rectified by
adjusting the nondynamical phase parameters.

e. A KP solution corresponding to the location of each of the wave gages was
calculated. A normalized plot comparing theory to measurements was made, as
shown in Figure 7 for the present example. Included in each plot is the Root
Mean Square (RMS) error for each comparison.

f. A normalized contour plot (Figure 8) and a three-dimensional plot
(Figure 9) for each wave field were finally prepared as a visual example of the
KP solution.
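The acceptance tests used in steps (a) through (c), and the RMS comparison of step (e), can be expressed compactly. The sketch below is illustrative only: the dictionary keys, function names, and sample numbers are hypothetical, and the RMS error is computed on traces that are assumed to have been normalized already, since the text does not say how the normalization was done.

```python
import math

# Fractional tolerances quoted in the text for each fitted quantity.
TOLERANCES = {"height": 0.05, "period": 0.03, "y_wavelength": 0.10}


def within_tolerance(computed, reference, fraction):
    """True when computed differs from reference by at most fraction * |reference|."""
    return abs(computed - reference) <= fraction * abs(reference)


def all_tolerances_met(computed, reference):
    """Check every fitted quantity; steps (a)-(c) repeat until this is True."""
    return all(
        within_tolerance(computed[key], reference[key], TOLERANCES[key])
        for key in TOLERANCES
    )


def rms_error(theory, measured):
    """Root mean square difference between two equally sampled traces."""
    if len(theory) != len(measured) or not theory:
        raise ValueError("traces must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(theory, measured))
    return math.sqrt(squared / len(theory))


# Hypothetical numbers, for illustration only.
measured = {"height": 3.2, "period": 1.378, "y_wavelength": 17.5}
computed = {"height": 3.24, "period": 1.40, "y_wavelength": 17.0}
print(all_tolerances_met(computed, measured))
```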


The above procedures were followed for each of the test wave fields of
Table 1. A tolerance of 5.0 percent for wave height, 3.0 percent for
period, and 10.0 percent for the Y-direction wavelength was maintained for all
experiments. Table 2 presents the computed results. For each case, an
average RMS error is provided which represents a simple average of the 9 RMS
values computed for the individual gages. In no case did this error exceed
20 percent even though variations in the elevation of the basin floor of
10 percent were known to exist. Additionally, the experimental wave fields were
generated almost to the point of breaking in order to span the range of
solution parameters and investigate the limits of applicability of the genus 2
solutions. In view of these introduced and existing sources of potential
error, the degree of fit between the generated wave fields and the exact
solutions was found to be very good.

Table 2

Computed wave parameters

Test Number   Max. Height (in)   X-Wavelength (ft)   Y-Wavelength (ft)   Ave. RMS Error

CN1007        2.44                7.0                46.5                0.141
CN1507        3.59                7.2                35.1                0.188
CN2007        3.06                7.5                27.3                0.150
CN3007        3.24                7.9                17.0                0.143
CN4007        3.30                8.7                13.6                0.184
CN1011        2.23               10.7                48.0                0.174
CN1511        2.87               11.1                40.3                0.122
CN2011        3.10               11.6                27.6                0.126
CN3011        2.48               12.6                20.7                0.172
CN1015        2.65               15.0                59.3                0.120
CN1515        2.84               16.1                32.6                0.094
CN2015        2.86               17.1                29.0                0.098
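The statement that the average RMS error never exceeded 20 percent can be confirmed directly from the last column of Table 2. The short check below simply lists those twelve values in row order; the variable names are illustrative.

```python
# Average RMS error for each of the twelve test waves, in Table 2 row order.
AVERAGE_RMS = [0.141, 0.188, 0.150, 0.143, 0.184, 0.174,
               0.122, 0.126, 0.172, 0.120, 0.094, 0.098]

largest = max(AVERAGE_RMS)
smallest = min(AVERAGE_RMS)
overall_mean = sum(AVERAGE_RMS) / len(AVERAGE_RMS)

print(f"largest  average RMS error: {largest:.3f}")
print(f"smallest average RMS error: {smallest:.3f}")
print(f"mean over all twelve tests: {overall_mean:.3f}")
print("all below 0.20:", all(value < 0.20 for value in AVERAGE_RMS))
```

In this table the three smallest errors belong to the 15 ft wavelength cases, while the largest occur among the 7 ft cases.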

IV. CONCLUSIONS. Twelve separate nonlinear wave fields were generated for
the  purpose  of  verifying  the  KP  equation  to  be  an  accurate  model  for  three- 
dimensional  nonlinear  waves.  Criteria  were  developed  which  provided  a  unique 
correspondence  between  the  solution  parameters  of  the  KP  equation  and  the 
physical  characteristics  of  the  laboratory  generated  waves.  Results  of  these 
experiments  showed  that  both  the  generated  waves  and  the  genus  2  solutions  are 
remarkably  robust  in  that  both  were  stable  over  a  wide  range  of  parameters, 
including  the  near  breaking  of  waves.  The  excellent  degree  of  fit  between  the 
observed  and  computed  solutions  shows  that  the  genus  2  solutions  of  the  KP 
equation  represent  a  viable  model  for  three-dimensional,  nonlinear,  shallow 
water  waves. 

ABSTRACT: Conventional asymptotic methods often fail to
capture effects that are transcendentally small because these effects
lie  beyond  all  orders  in  the  asymptotic  expansion.  New  methods 
have  been  developed  recently  to  find  transcendentally  small  terms 
in  problems  where  they  control  the  entire  solution.  This  paper 
surveys  some  of  these  recent  developments. 

TEXT:  When  a  differential  equation  that  cannot  be  solved 
exactly  contains  a  small  parameter,  conventional  asymptotic  methods 
often can be used effectively to find approximate solutions [1]. In a
small but significant class of problems these conventional methods
utterly fail, because they provide no nontrivial information at any
order  of  the  expansion.  In  these  problems  it  is  necessary  to  go 
beyond  all  orders  in  the  asymptotic  expansion  to  answer  questions  of 
interest.  "Asymptotics  beyond  all  orders"  describes  the  more 
delicate  methods  required  to  obtain  information  in  these  pathological 
problems. 

The  essential  problem  can  be  seen  in  a  simple  function  like 

$$ f(\epsilon) = \exp(-1/\epsilon), \qquad 0 < \epsilon \ll 1. \tag{1} $$

The function is well-defined, and it is positive for any positive $\epsilon$, no
matter how small. The function is not analytic at $\epsilon = 0$, so it has no
Taylor series there. If one ignores this fact and tries to evaluate $f(\epsilon)$
for small $\epsilon$ with a formal "Taylor series", one obtains zero at every
order of the expansion:

$$ f(\epsilon) \sim 0 + 0\cdot\epsilon + 0\cdot\epsilon^2 + \cdots \tag{2} $$

(Here ~ means "is asymptotically approximated by" [1].) Clearly (2)
does not imply that the function is zero, but simply that the
expansion is too crude to evaluate it.
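A quick numerical experiment makes the point concrete: exp(-1/eps) is positive, yet it is smaller than any fixed power of eps once eps is small enough, which is why every coefficient of the formal series is zero. The sketch below is illustrative; the function names and the particular values of eps and n are arbitrary choices.

```python
import math


def f(eps):
    """The transcendentally small function exp(-1/eps)."""
    return math.exp(-1.0 / eps)


def ratio(eps, n):
    """f(eps) divided by eps**n; tends to zero as eps -> 0+ for every fixed n."""
    return f(eps) / eps ** n


SAMPLE_EPS = (0.1, 0.05, 0.02, 0.01)

for n in (1, 5, 10):
    values = [ratio(eps, n) for eps in SAMPLE_EPS]
    print(f"n = {n:2d}:", ", ".join(f"{value:.3e}" for value in values))
```

For each n the printed ratios fall toward zero as eps decreases, even though f(eps) itself never vanishes.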

This  example  is  too  simple  to  be  realistic,  but  a  variation  of  it  is 
common:  the  function  is  not  given  explicitly,  as  in  (1),  but  rather  it  is 
defined  implicitly  by  a  differential  equation.  For  example,  let 

$$ D(y, x; \epsilon) = 0, \qquad 0 < \epsilon \ll 1, \tag{3} $$

represent some differential equation which, along with boundary
conditions, uniquely defines its solution $y(x; \epsilon)$. We imagine that the
equation comes from some application, and that the question of
interest is to determine the sign of $y(0; \epsilon)$, the solution of (3) at x=0.

If the differential equation cannot be solved exactly, as most cannot,
conventional asymptotic methods [1] can be used to generate an
approximate solution. In the simplest cases, the series contains only
increasing powers of $\epsilon$:

$$ y(x; \epsilon) \sim y_0(x) + \epsilon\, y_1(x) + \epsilon^2 y_2(x) + \cdots . $$

How many terms are needed in this series depends on the problem,
but in any case the sequential approximating functions are obtained
explicitly. With these functions explicitly in hand, one evaluates
them sequentially at x=0 to obtain an increasingly accurate
description of the desired function, $y(0; \epsilon)$. Ordinarily this approach
is successful, but occasionally one finds that at x=0:

$$ y_0(0) = 0, \quad y_1(0) = 0, \quad y_2(0) = 0, \ \ldots $$

and one can show recursively that at every order of the expansion,
$y_n(0) = 0$. In this problem, therefore, one has shown that $y(0; \epsilon)$
vanishes to all orders in the asymptotic expansion. It follows either
that $y(0; \epsilon) = 0$, or that $y(0; \epsilon)$ is transcendentally small, as in (1).
Thus the calculation has failed completely to answer the question of
interest: whether or not $y(0; \epsilon)$ is positive. This failure persists even
if the expansion is carried to all orders; in this problem it is simply
too crude to answer the question.

At  this  point  the  reader  may  concede  that  conventional 
asymptotic  methods  cannot  detect  transcendentally  small  terms,  but 
may  wonder  why  anyone  would  care  about  such  small  effects.  I  now 
describe  briefly  some  problems  in  which  this  issue  has  arisen,  and  in 
which the most fundamental questions about the problem hinge on
whether certain transcendentally small terms do or do not vanish.
In these problems, questions like "Does a solution exist?" cannot be
answered without going beyond all orders in the asymptotic
expansion. This brief survey does not discuss how to solve these
problems, but references to the recent literature are given below.

It  should  be  mentioned  that  transcendentally  small  effects 
have  been  evaluated  in  particular  linear  problems  over  the  last  30 
years  [2,  3,  4,  5,  6].  What  distinguishes  the  recent  flurry  of  activity  is 
the  realization  that  no  linear  structure  is  required,  and  the  recent 
work  has  treated  fully  nonlinear  problems.  On  the  other  hand, 
almost  all  of  the  recent  work  that  has  appeared  in  print  to  date  is 
formal,  with  no  assertion  of  rigor. 

My  first  example  is  known  as  "viscous  fingering  of  fluids",  or 
as  the  "Saffman-Taylor  paradox",  after  the  famous  paper  by  these 
authors  [7].  Motivated  by  a  problem  of  interest  in  petroleum 
engineering,  Saffman  and  Taylor  studied  the  slow  motion  of  the 
interface  between  two  fluids  of  different  viscosities  (such  as  oil  and 
water).  They  found  experimentally  that  when  the  fluids  were 
confined  to  a  narrow  gap  between  two  parallel  walls  (a  Hele-Shaw 
cell),  the  less  viscous  fluid  could  be  made  to  push  steadily  into  the 
more viscous fluid in a single symmetric, uniformly growing "finger".
They also found experimentally that the width of this finger far from
the tip was extremely predictable ($\lambda = 1/2$ in their dimensionless
notation) in the appropriate range of their experimental parameters.

In  the  same  paper,  the  authors  analyzed  the  (Navier-Stokes) 
equations  of  motion,  seeking  a  steady-state  solution  to  describe  this 
steadily  growing  finger.  They  found  that  if  they  made  certain 
plausible  approximations,  including  neglecting  surface  tension,  then 
they  could  find  a  continuous  family  of  exact  solutions  of  the  resulting 
equations. This family was parameterized by $\lambda$, the finger width.
The solution corresponding to $\lambda = 1/2$ agreed well with their
experimental data. However, the question of identifying the
selection mechanism that picked out $\lambda = 1/2$ in their experiments
remained open.

The  hypothesis  that  surface  tension  provided  the  selection 
mechanism  was  tested  by  McLean  and  Saffman  [8],  who  developed 
an  asymptotic  expansion  for  the  shape  of  the  finger  in  powers  of  the 
(small)  surface  tension,  starting  at  zeroth  order  with  a  Saffman- 
Taylor  solution.  These  exact  solutions  were  left-right  symmetric,  and 
McLean-Saffman  intended  to  show  that  this  symmetry  was  broken  in 
the  presence  of  any  small,  positive  surface  tension.  To  their  surprise 
they  found  that  the  symmetry,  and  therefore  the  continuous  family 
of solutions found in [7], persisted to all orders in their asymptotic
expansion; i.e., they found no analytical evidence that surface tension
provided  the  selection  mechanism  observed  experimentally.  Adding 
to  the  confusion  were  numerical  experiments  by  them  [8]  and  by 
Vanden  Broeck  [9],  using  finite  values  for  the  surface  tension,  which 
seemed  to  indicate  that  surface  tension  did  provide  a  selection 
mechanism. 

The  paradox  was  resolved  in  three  papers  published 
simultaneously  [10,  11,  12].  Each  of  these  papers  showed  that  small 
surface  tension  does  indeed  break  the  symmetry  and  destroy  the 
continuous  family  of  solutions.  However,  the  symmetry  is  broken  by 
an  exponentially  small  amount,  so  this  breaking  lies  beyond  all 
orders  in  the  asymptotic  expansion  in  [8],  and  it  is  not  captured  by 
that  analysis.  This  is  an  example  of  a  problem  of  physical  interest  in 
which  the  most  basic  question  one  can  ask  about  the  model,  whether 
it even has a solution for arbitrary values of λ, cannot be answered
without  going  beyond  all  orders  in  the  asymptotic  expansion. 
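
A standard textbook illustration (not drawn from the papers cited here) shows how a term can hide beyond all orders. The symbols f, F, C and ε below are introduced only for this sketch.

```latex
\[
f(\epsilon) = e^{-1/\epsilon}, \qquad
\lim_{\epsilon \to 0^{+}} \frac{f(\epsilon)}{\epsilon^{n}} = 0
\quad \text{for every } n = 0, 1, 2, \dots
\]
\[
\text{Hence } f(\epsilon) \sim 0 + 0\cdot\epsilon + 0\cdot\epsilon^{2} + \cdots ,
\qquad\text{and}\qquad
F(\epsilon) \sim \sum_{n \ge 0} a_n \epsilon^{n}
\;\Longrightarrow\;
F(\epsilon) + C\, e^{-1/\epsilon} \sim \sum_{n \ge 0} a_n \epsilon^{n}.
\]
```

Two functions that differ by an exponentially small amount therefore share the same expansion in powers of ε, so no order of that expansion can reveal whether C vanishes.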

A  second  example  arises  in  the  study  of  growing  crystals  in  a 
supercooled  melt  of  a  pure  substance.  The  verbal  description  of  the 
problem  is  quite  similar  to  that  of  the  viscous  fingers.  Under 
appropriate  conditions,  a  solid  crystal  is  observed  to  grow  into  the 
liquid  melt.  The  overall  shape  of  the  crystal  is  complicated  and  time- 
dependent,  but  the  tip  apparently  grows  with  a  nearly  constant 


shape  and  at  a  nearly  constant  speed.  From  the  speed  and  radius  at 
the  tip,  one  can  form  a  dimensionless  Peclet  number,  and  this 
number  is  observed  experimentally  to  depend  only  on  the  substance 
in  question  and  on  its  temperature. 

Important theoretical work was done by Ivantsov [13], who
found  that  by  neglecting  surface  tension,  he  could  produce  a 
continuous,  one-parameter  family  of  exact,  steadily  growing,  two- 
dimensional  crystal  shapes,  called  "needle  crystals".  These  were 
later  generalized  to  three-dimensional  needle  crystals  with 
ellipsoidal  symmetry  [14].  In  both  cases  the  free  parameter  was  the 
Peclet  number.  Thus  we  have  a  second  paradox:  the  theory  predicts 
a  needle  crystal  for  every  Peclet  number,  while  the  experiments 
show  that  one  Peclet  number  is  always  selected.  Again  the  question 
arises:  What  is  the  selection  mechanism?  In  particular,  does  a  small 
amount  of  surface  tension  break  up  the  continuous  family  of  exact 
solutions? 

With  surface  tension  included,  the  exact  governing  equations 
for  this  problem  are  quite  complicated  [15],  and  two  simplified 
models were constructed to help to guide the analysis [16, 17]. In a
paper which (embarrassingly) is still unpublished, Kruskal and
Segur [18] obtained the following results for one of these models (the
"geometric  model"). 

(a)  Without  surface  tension,  the  model  admits  a  spatially  symmetric 
needle-crystal solution for every Peclet number.

(b)  With  or  without  surface  tension,  every  needle-crystal  solution  in 
this  model  must  be  spatially  symmetric. 

(c)  For  small  surface  tension  and  for  every  Peclet  number,  the 
model  admits  an  asymptotic  expansion  for  a  needle  crystal  that  is 
symmetric  to  all  orders  in  the  expansion. 

(d)  For  sufficiently  small  surface  tension,  every  solution  of  the 
model  is  asymmetric.  The  amount  of  asymmetry  is  exponentially 
small,  so  it  is  missed  at  every  order  of  the  asymptotic  expansion. 

Even  so,  it  follows  that  the  geometric  model  has  no  needle-crystal 
solutions  for  small  surface  tension,  even  though  they  exist  to  all 
orders  in  the  asymptotic  expansion. 

(e)  If  one  adds  to  the  model  a  second  parameter  ("crystalline 
anisotropy"),  then  for  each  value  of  that  parameter  the  model  admits 
a  needle  crystal  only  for  a  discrete  set  of  values  of  the  Peclet 
number. 

Some  of  these  results  were  also  obtained  by  others,  using 
different means of analysis [19, 20]. From our standpoint, the main
conclusion  of  all  of  these  analyses  is  that  the  question  of  whether  the 
geometric  model  even  has  a  needle-crystal  solution  cannot  be 
decided  without  going  beyond  all  orders  in  the  asymptotic  expansion. 

In  more  recent  work  [21]  it  has  been  claimed  that  a  similar 
situation  occurs  in  the  full  equations  for  needle  crystals. 

Now  let  us  consider  a  third  example  in  which  asymptotics 
beyond  all  orders  plays  a  decisive  role.  In  one  spatial  dimension,  a 
Klein-Gordon  equation  is  a  partial  differential  equation  of  the  form: 

u_{tt} - u_{xx} + g(u) = 0,    g(0) = 0,    g'(0) > 0.


In the usual linear equation, g(u) = mu, where m represents "mass".
Out of all possible nonlinear equations, two that have been studied
extensively are the sine-Gordon equation with g(u) = sin u, and the
φ^4-model with g(u) = 2u - 3u^2 + u^3. The latter name comes from
setting u = φ + 1, after which the Lagrangian density for this model
differs from that for the linear model by a term (φ^4).
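
The origin of the name can be checked with a short calculation. The sketch below assumes the conventional Lagrangian density with a potential G(u) satisfying G'(u) = g(u) and G(0) = 0; the symbols L and G are introduced here for illustration and do not appear in the text above.

```latex
\[
\mathcal{L} = \tfrac12 u_t^{2} - \tfrac12 u_x^{2} - G(u), \qquad G'(u) = g(u)
\quad\Longrightarrow\quad u_{tt} - u_{xx} + g(u) = 0 .
\]
\[
g(u) = 2u - 3u^{2} + u^{3} = u\,(u-1)(u-2), \qquad
G(u) = u^{2} - u^{3} + \tfrac14 u^{4} = \tfrac14\, u^{2}(u-2)^{2} .
\]
\[
u = \phi + 1: \qquad
G = \tfrac14\,(\phi^{2}-1)^{2} = \tfrac14\,\phi^{4} - \tfrac12\,\phi^{2} + \tfrac14 .
\]
```

In the shifted variable the potential is even in φ, and its only term beyond quadratic order is the quartic one.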

A  "breather"  is  defined  to  be  a  real-valued  solution  of  a 
nonlinear  Klein-Gordon  equation  that  is  localized  in  space  and 
periodic  in  time,  with  a  nontrivial  period.  If  one  thinks  of  the  Klein- 
Gordon  equation  as  a  classical  model  of  a  field  theory  in  one 
dimension, then any localized solution might represent an
elementary particle and a breather might represent a particle with
an  internal  degree  of  freedom.  Breathers  have  physical  importance 
if  they  exist. 

Breathers  are  known  to  exist  for  the  sine-Gordon  equation  [22]: 

u(x,t) = 4 \arctan\{ [\sqrt{1-\omega^2}/\omega] \, \mathrm{sech}(\sqrt{1-\omega^2}\, x) \, \sin \omega t \}
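
As a sanity check, the standard sine-Gordon breather, u = 4 arctan{ [sqrt(1 - ω^2)/ω] sech(sqrt(1 - ω^2) x) sin ωt } with 0 < ω < 1, can be substituted into u_tt - u_xx + sin u = 0 numerically. The following is a minimal sketch; the function names, the step size h, and the sample points are choices made for this illustration only.

```python
import math

def breather(x, t, omega):
    """Standard sine-Gordon breather; assumes 0 < omega < 1."""
    k = math.sqrt(1.0 - omega * omega)
    return 4.0 * math.atan((k / omega) * math.sin(omega * t) / math.cosh(k * x))

def residual(x, t, omega, h=1e-3):
    """Central-difference estimate of u_tt - u_xx + sin(u) at (x, t)."""
    u0 = breather(x, t, omega)
    u_tt = (breather(x, t + h, omega) - 2.0 * u0 + breather(x, t - h, omega)) / h**2
    u_xx = (breather(x + h, t, omega) - 2.0 * u0 + breather(x - h, t, omega)) / h**2
    return u_tt - u_xx + math.sin(u0)

omega = 0.6
worst = max(abs(residual(x, t, omega))
            for x in (-2.0, 0.0, 0.7, 3.0)
            for t in (0.3, 1.1, 2.5))
period = 2.0 * math.pi / omega
print("largest residual:", worst)
print("u far from the origin:", breather(20.0, 1.0, omega))
print("change over one period:", breather(0.7, 1.1 + period, omega) - breather(0.7, 1.1, omega))
```

The residual is limited only by the finite-difference error, and the other two printed values illustrate the two defining properties of a breather: localization in space and periodicity in time.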

The  question  is  whether  they  exist  for  any  other  Klein-Gordon 
equations. For small amplitudes (i.e., u << 1), the sine-Gordon and the
φ^4-models approximate each other, so one expects the φ^4-model to
admit at least an approximate breather solution for small amplitudes.
It turns out that the φ^4-model admits an asymptotic expansion for a
breather in a small amplitude limit, and that this expansion can be
carried to all orders without developing any secular terms. This
approximate breather was used by Dashen, Hasslacher and Neveu
[23] in their quantization of φ^4. The question of whether the
expansion  represents  a  true  breather  solution  was  not  addressed  by 
them. 

Segur and Kruskal [24] showed that the φ^4-model admits no true
breathers  in  this  limit.  The  asymptotic  expansion  does  represent 
true  solutions  to  the  equation,  but  none  of  them  are  both  localized  in 
space  and  periodic  in  time.  Typically,  these  solutions  radiate  energy 
away,  but  at  a  rate  that  is  exponentially  small,  and  that  is  missed  by 
the  asymptotic  expansion  even  when  carried  to  all  orders. 
Nevertheless,  this  exponentially  small  radiation  rate  is  enough  to 
carry  away  all  of  the  energy  eventually,  so  eventually  the  breather 
disappears. 

A  final  example  involves  an  ideal  pendulum  under  the 
influence of small, periodic forcing [25]. It is known that a small,
periodic  forcing  at  moderate  frequency  of  a  pendulum  typically 
destroys  the  integrability  of  the  problem  and  introduces  chaotic 
trajectories  of  the  pendulum.  The  concrete  evidence  of 
nonintegrability  (the  Melnikov  integral)  vanishes  to  all  orders  in  the 
high-frequency  limit,  but  Holmes,  Marsden  and  Scheurle  [25]  showed 
by  evaluating  exponentially  small  terms  that  the  forcing  destroys 
integrability  in  this  limit  as  well. 

Perhaps  it  is  appropriate  to  conclude  this  survey  with  two 
general  remarks.  The  first  is  that  all  of  the  problems  mentioned  here 
are  pathological,  in  the  sense  that  it  is  rare  for  an  asymptotic 
expansion  to  yield  no  information  at  any  order.  The  existence  of 
these  pathological  examples  does  not  mean  that  no  asymptotic 
expansion  should  be  trusted,  but  rather  that  they  must  be 
interpreted  correctly.  The  second  remark  is  that  even  though  these 


examples  are  pathological,  they  are  not  unphysical.  Each  came  from 
a  real-world  problem  of  physical  interest.  Pathological  problems  do 
arise in physical contexts, but only occasionally.

COMPUTATIONAL ISSUES IN GOAL PROGRAMMING
RESOURCE  ALLOCATION 

Leon  Medler 

U.S.  Army  Troop  Support  Command 
Belvoir  RD&E  Center 
Fort  Belvoir,  VA  22060-5606 


1.  EXECUTIVE  SUMMARY 

The  US  Army  Belvoir  RD&E  Center  has  been  engaged  in  a 
program  of  research  aimed  at  developing  an  efficient  and  fair 
methodology  for  ranking  proposed  RD&E  programs.  A  linear  goal 
programming model (LGPM) has been the foundation of this methodology.
Problems with this approach have surfaced in two areas:
(1)  excessive  computer  time  and  (2)  anomalous  results.  This 
paper  reports  on  research  conducted  in  order  to  redress  these 
problems.  Analysis  of  the  existing  model  indicated  that  time 
expenditure  in  sensitivity  excursions  was  the  major  culprit. 
Sensitivity  analysis  was  being  conducted  by  decrementing  resource 
constraints  and  rerunning  the  LGPM  "from  scratch."  Since 
virtually  as  much  computation  was  involved  in  the  reruns  as  in 
the  initial  run,  significant  computation  expense  was  being 
incurred.  Therefore,  we  modified  the  LGPM  to  start  from  the  last 
solution  (i.e.,  the  last  LGPM  simplex  tableau).  This  required 
the  addition  of  a  dual  simplex  algorithm  to  supplement  the 
regular  simplex  algorithm,  since  the  last  solution  becomes 
infeasible under certain resource constraint changes. For a
110-project prioritization problem involving ten resource levels, the
improved LGPM requires only 10-20% as many simplex iterations as
did the old LGPM.

2.  THEORY  OF  IMPLEMENTATION 

The  following  discussion  assumes  some  familiarity  with  the 
fundamentals  of  linear  programming  and  linear  goal  programming, 
as might be found in Ignizio [1982]. We largely use the notation from
this source, which tends to be standard in the linear
programming literature. We will use [] to denote matrices and
arrays,  and  []T  to  denote  the  transpose  of  a  matrix  or  vector. 

The  following  abbreviations  will  be  used: 

IBFS  -  initial  basic  feasible  solution 

LGPM  -  linear  goal  programming  model 

LP  -  linear  program 


RHS  -  right hand side.

In order to explain the general method for forcing an LGPM to
begin  at  a  prescribed  solution,  it  will  be  easiest  to  explain  the 
method  for  the  simplest  sort  of  LP.  The  generalization  to  the 
more  complex  preemptive  LGPM  is  straightforward.  Therefore, 
consider  the  following  simple  LP  problem: 

(1)  Minimize  z = [c][x]T

     s.t.  [A][x]T <= [b]T

           [x]T >= [0]T.


Assume  that  this  problem  is  such  that  an  IBFS  can  be  found  by 
using  simple  slack  variables.  This  corresponds  to  rewriting  the 
problem  as: 

(2)  Minimize  z = [c,0][x,s]T

     s.t.  [A,I][x,s]T = [b]T
           [x,s]T >= [0]T


In these last two formulations A is an m x n matrix, c and x are
1 x n matrices (vectors), s and b are both 1 x m matrices
(vectors), I is the m x m identity matrix, and 0 is either 1 x n or
1 x (n+m), as appropriate in context.


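The LP problem is then solved using the simplex algorithm,
which begins by operating on the following initial extended
tableau, corresponding to using the slack variables as the
initial basis:

  cj's ->             c1      c2     ...  cn      0       0      ...  0
  cB   | BV(xB) |     x1      x2     ...  xn      s1      s2     ...  sm     |  RHS
  0    |  s1    |     a1,1    a1,2   ...  a1,n    1       0      ...  0      |  b1
  0    |  s2    |     a2,1    a2,2   ...  a2,n    0       1      ...  0      |  b2
   .       .           .                                                        .
  0    |  sm    |     am,1    am,2   ...  am,n    0       0      ...  1      |  bm
  Indicators -> |     z1-c1   z2-c2  ...  zn-cn   zn+1    zn+2   ...  zn+m   |  z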


The  simplex  algorithm  then  proceeds,  with  the  status  at  any 
particular  iteration  represented  by  an  extended  tableau  of  the 
general  form: 


  cj's ->             c1      c2     ...  cn      0        ...  0
  cB   | BV(xB) |     x1      x2     ...  xn      s1       ...  sm       |  RHS
  cB1  |  xB1   |     y1,1    y1,2   ...  y1,n    y1,n+1   ...  y1,n+m   |  xB1
  cB2  |  xB2   |     y2,1    y2,2   ...  y2,n    y2,n+1   ...  y2,n+m   |  xB2
   .       .           .                                                     .
  cBm  |  xBm   |     ym,1    ym,2   ...  ym,n    ym,n+1   ...  ym,n+m   |  xBm
  Indicators -> |     z1-c1   z2-c2  ...  zn-cn   zn+1     ...  zn+m     |  z

The reader will recall that at any point in time the yi,j's
and xBi's may be expressed in terms of the m x m matrix [B]
defined as

     [B] = [ a(xB1), a(xB2), ..., a(xBm) ]

where a(xBj) denotes the column in the initial tableau
corresponding to the basic variable now labeled xBj. The
important relationships actually involve the inverse of [B]:

     [xB]T = [B]^-1 [b]T

     [yj]T = [B]^-1 [aj]T

where [b]T is the initial right hand side (RHS).

Once these are computed, the indicator row elements may be
computed as

     z  = [cB][xB]T

     zj = [cB][yj]T.

Thus, given the basis (xB1, xB2, ..., xBm), together with
[A], [b], and [c] from the original problem, we can identify [B],
find its inverse, and then calculate all the elements of the
tableau needed to start the simplex procedure from that point.
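
The reconstruction just described can be sketched in a few lines. This is an illustrative sketch rather than the LGPM code itself: the function names, the use of exact fractions, and the three-constraint example problem are all assumptions made here.

```python
from fractions import Fraction

def inverse(B):
    """Invert a square matrix by Gauss-Jordan elimination on [B | I].
    Raises StopIteration if B is singular (no nonzero pivot is found)."""
    m = len(B)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(B)]
    for col in range(m):
        piv = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * bb for a, bb in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]

def tableau_from_basis(A, b, c, basis):
    """Rebuild the extended tableau for a given list of basic-variable indices.
    Columns 0..n-1 are the x variables, columns n..n+m-1 the slacks."""
    m, n = len(A), len(A[0])
    full = [[Fraction(v) for v in A[i]] + [Fraction(int(i == k)) for k in range(m)]
            for i in range(m)]
    cost = [Fraction(v) for v in c] + [Fraction(0)] * m
    B = [[full[i][j] for j in basis] for i in range(m)]      # basic columns of the initial tableau
    Binv = inverse(B)
    xB = [sum(Binv[i][k] * Fraction(b[k]) for k in range(m)) for i in range(m)]
    Y = [[sum(Binv[i][k] * full[k][j] for k in range(m)) for j in range(n + m)]
         for i in range(m)]
    cB = [cost[j] for j in basis]
    indicators = [sum(cB[i] * Y[i][j] for i in range(m)) - cost[j] for j in range(n + m)]
    z = sum(cB[i] * xB[i] for i in range(m))
    return Y, xB, indicators, z

# Hypothetical example: minimize -3*x1 - 5*x2
# subject to x1 <= 4, 2*x2 <= 12, 3*x1 + 2*x2 <= 18.
A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]
c = [-3, -5]
Y, xB, indicators, z = tableau_from_basis(A, b, c, basis=[2, 1, 0])  # basis (s1, x2, x1)
print(xB, z, indicators)
```

Only the identity of the basic variables is supplied; their values, the yi,j's, and the indicator row all follow from [B]^-1.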
3.  CONCLUSIONS


Several  conclusions  can  now  be  drawn. 

(1)  Note  that  to  construct  B  we  only  need  the 
identification  of  the  basis  variables,  not  their  values. 

(2)  If any series of Gauss-Jordan operations leads from the
initial tableau to the tableau with basis xB, then in the final
tableau the important matrix [B]^-1 actually appears, viz., as the
array occupying the columns under the slack variables:

     | y1,n+1  ...  y1,n+m |
     |   .             .   |
     | ym,n+1  ...  ym,n+m |

This observation is the key to efficient sensitivity analysis,
and is called "A Fundamental Insight" by Hillier and Lieberman
[1980].

(3)  Recall from elementary linear algebra that the inverse
of [B] may be calculated by forming the rectangular matrix [B | I]
and conducting appropriate Gauss-Jordan operations until the
identity matrix appears on the left: [I | D]. At that point the
square matrix on the right is D = [B]^-1.

(4)  The simplex procedure is simply a sequence of Gauss-Jordan
operations, guided by rules for selecting the pivot point.

(5)  The yi,j's and xBi's computed using [B]^-1 in the
formulas above are identical to those that would be obtained
through a series of simplex operations yielding the same basis
vector.
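
Conclusion (2) is what makes a warm restart cheap when a resource level is decremented: the new basic values are [B]^-1 times the new RHS, and a negative entry signals that the old basis is no longer feasible, so dual simplex pivots are needed. A minimal sketch follows; the numerical [B]^-1 and the helper names are illustrative assumptions, not values from the 110-project problem.

```python
from fractions import Fraction as F

# Slack-variable columns of a final tableau, i.e. [B]^-1 (illustrative numbers).
B_inv = [[F(1), F(1, 3), F(-1, 3)],
         [F(0), F(1, 2), F(0)],
         [F(0), F(-1, 3), F(1, 3)]]

def basic_values(B_inv, rhs):
    """Values of the basic variables for a given right hand side: [B]^-1 [b]T."""
    return [sum(row[k] * F(rhs[k]) for k in range(len(rhs))) for row in B_inv]

def needs_dual_simplex(B_inv, rhs):
    """True when the old basis is infeasible for the new right hand side."""
    return any(v < 0 for v in basic_values(B_inv, rhs))

original = basic_values(B_inv, [4, 12, 18])   # the RHS the basis was found for
mild_cut = basic_values(B_inv, [4, 12, 15])   # third resource decremented slightly
deep_cut = basic_values(B_inv, [4, 12, 10])   # third resource decremented further
print(original, mild_cut, deep_cut)
```

In this example the mild decrement leaves the basis feasible, so the restart needs no pivots at all, while the deeper one drives a basic variable negative.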

Taken  together,  the  above  suggests  a  framework  for  a 
simple  and  efficient  technique  to  construct  the  simplex  tableau 
appropriate  to  the  desired  initial  basis: 

Employ  the  basic  simplex  algorithm,  but  override  the  normal 
selection  of  pivot  point  (entering  and  exiting  variables), 
instead  forcing  the  desired  variables  into  the  basis  and 
prohibiting  their  exit. 
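
One way to realize this override is sketched below. It is an assumption-laden illustration, not the modified LGPM: the function names are hypothetical, the indicator row is omitted for brevity, and the pivot row is simply the first unlocked row with a nonzero entry, with no ratio test.

```python
from fractions import Fraction

def pivot(T, r, col):
    """One Gauss-Jordan pivot on tableau T at row r, column col."""
    p = T[r][col]
    T[r] = [v / p for v in T[r]]
    for i in range(len(T)):
        if i != r and T[i][col] != 0:
            f = T[i][col]
            T[i] = [a - f * bb for a, bb in zip(T[i], T[r])]

def force_basis(A, b, desired):
    """Pivot the desired variables into the basis, starting from the slack basis.
    Columns 0..n-1 are x, n..n+m-1 are slacks, and the last column is the RHS."""
    m, n = len(A), len(A[0])
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == k)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    basis = [n + i for i in range(m)]
    # Rows whose basic variable is already wanted are locked: no exit allowed.
    locked = {i for i in range(m) if basis[i] in desired}
    for j in desired:
        if j in basis:
            continue
        # Forced entry: any unlocked row with a nonzero pivot element will do.
        r = next(i for i in range(m) if i not in locked and T[i][j] != 0)
        pivot(T, r, j)
        basis[r] = j
        locked.add(r)
    return T, basis

# Hypothetical example: constraints x1 <= 4, 2*x2 <= 12, 3*x1 + 2*x2 <= 18,
# with the desired basis {x1, x2, s1} = column indices {0, 1, 2}.
A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]
T, basis = force_basis(A, b, desired=[0, 1, 2])
print(basis, [row[-1] for row in T])
```

Because the ratio test is skipped, intermediate tableaux may be infeasible; only the final tableau, whose slack columns hold [B]^-1, is of interest.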

ABSTRACT. A feeling-thinking machine has been designed using the mammalian
brain as a model and current psychobiology concepts as a guide. The machine has
been successfully run as a computer simulation. It mimics a primitive organism
with eight functional brain centers. They are the reticular ascending
substance (RAS), the amygdala, the cingulate gyrus, the medial forebrain bundle,
the hippocampus, the thalamus, the hypothalamus, and the neocortex.

I.  INTRODUCTION.  Machine  intelligence  for  autonomous  systems  must  be 
capable  of  learning  and  especially  thinking,  if  we  are  to  go  beyond  the  'islands 
of  autonomy'  presently  envisioned  for  teleoperated  and  remotely  piloted 
vehicles.  One  approach  is  to  use  the  mammalian  brain  as  a  model  and  investigate 
the  possibility  of  duplicating  its  functions  in  electronic  circuitry.  This 
extension  of  neural  network  design  is  called  non-living  intelligence  (NLI). 

The brain consists of approximately 10**12 neurons intricately interconnected.
Only a small part of this circuitry has been unravelled. The NLI
effort at Benet Labs does not describe how the brain works, but involves
electronic and computational experiments that provide insight into how the brain
might work. We pursue NLI through the design and construction of feeling-thinking
machines. Feeling is essential because without motivation, there is
nothing.  The  machine  must  want  to  do  things.  In  doing  things,  it  will  learn; 
and  having  learned,  it  will  think.  This  report  does  not  describe  machines  that 
"exhibit  intelligent  behavior";  but  rather  machines  that  feel,  want,  and  think. 
The  distant  goal  is  to  create  a  machine  that  thinks  and  acts  like  a  man.  This 
report  discusses  the  first  of  a  series  of  feeling-thinking  machine  designs. 

II.  THE  MODEL.  Our  approach  to  designing  this  machine  is  to  simulate  a 
primitive  organism  which  must  survive  within  a  contrived  universe.  We  have 
given  it  the  name  "Pacrat".  Pacrat's  brain  has  eight  brain  centers.  The 
electrical  activity  of  these  neural  centers  is  not  modeled,  only  the  functional 
relationships.  From  these  interactions  arises  a  sophisticated  structure  which 
rests  upon  the  anatomy  of  Pacrat's  brain.  The  neural  centers  modeled  are:  the 
reticular  ascending  substance  (RAS),  the  thalamus,  the  hypothalamus,  the 
amygdala,  the  cingulate  gyrus,  the  medial  forebrain  bundle,  the  hippocampus,  and 
the  isocortex.  (Gregory,  1975,  pp.  688-689) 

Individual neural response is not simulated, only the activity of assemblages
of neurons called codons. A codon is the result or record of an
experience. It exists as the altered synapses between the neurons which constitute
the assemblage. (Palm, 1982, 1986)




Pacrat,  in  diagrammatic  form,  is  shown  in  figure  1.  It  shows  that  he  has 
been  provided  with  the  ability  to  get  about  in  his  universe  through  four  motor 
neurons.  These  are  driven  by  the  motor  area  of  the  isocortex  as  a  final  result 
of  sensory  input,  channeled  to  the  isocortex  under  the  control  of  the  thalamus, 
and filtered through the isocortex under the influence of the prevailing emotion.


Hunger  is  the  level  of  neural  activity  in  an  area  of  the  hypothalamus  which 
we  will  call  the  hunger  center.  (Kissin,  1986,  p.  15)  The  model  assumes  there 
are  sensory  neurons  lining  the  stomach  wall  that  respond  to  expansions  and 
contractions  of  the  stomach.  These  determine  the  level  of  activity  of  the 
hunger center. As the stomach empties, the hunger center becomes more active;
as  the  stomach  fills,  it  becomes  less  active. 

Anger  is  also  a  level  of  neural  activity  in  an  area  of  the  hypothalamus, 
but  in  this  case  the  cause  is  the  activation  of  certain  codons  in  the  isocortex 
as  mediated  through  the  amygdala.  When  this  area  is  active,  Pacrat  experiences 
some  level  of  anger  or  frustration.  The  activity  in  the  amygdala  is  quickly 
inhibited  by  eating.  (Flynn,  et  al.,  1970)  (LeDoux,  1986,  p.  342) 

Fear  is  the  level  of  activity  of  the  cingulate  gyrus.  This  activity  is, 
subjectively, unease escalating to terror. In Pacrat, it is assumed that sensory
neurons excite the cingulate gyrus whenever his back is uncovered. This is
agoraphobia,  the  fear  of  open  places. 

Curiosity  is  the  level  of  activity  of  the  hippocampus.  It  is  set  off  by 
the  activation  of  a  codon  in  the  isocortex  which  has  not  previously  been 
excited.  The  continual  excitation  of  "old"  codons  will  allow  this  activity  to 
fade  away.  Pacrat's  hippocampus  has  efferents  on  his  motor  area  with  the  result 
that  "newness"  leads  to  exploratory  rather  than  hunger  or  fear  driven  activity. 

All  sensory  input  (other  than  olfactory)  is  gated  through  the  thalamus  to 
the  isocortex.  Thus  the  thalamus  can  relay  or  block  this  input.  It  can  also 
inhibit the motor output that would normally result from activity in the isocortex.
The thalamus does this in a rhythmic manner when the reticular ascending
substance  (RAS)  is  stimulated.  The  RAS  is  excited  whenever  the  hypothalamus  or 
the  cingulate  gyrus  is  active. 

The  thalamus  extends  this  period  of  choking  off  sensory  input  when  it 
receives  impulses  from  a  codon  through  synapses  which  have  been  facilitated  in 
the  past  by  the  reward-punishment  mechanism.  This  blocking  of  sensory  input  and 
an associated inhibition of motor output is the function of the thalamic reticular
complex. On the other hand, if a codon is activated which has a facilitated
synapse  on  the  "goal"  area  of  the  thalamus  (cf.  akinetic  mutism,  Girvin,  1975), 
sensory  input  is  gated  to  the  isocortex  and  the  motor  output  is  enabled. 

The  normal  activity  of  the  isocortex  is  association.  During  each  "moment" 
there  is  an  active  codon  which  has  efferents  on  the  motor  output  system.  If 
this  system  is  not  inhibited,  motor  output  will  follow.  This  codon  fades  out  as 
its  store  of  strategic  molecules  becomes  temporarily  depleted.  As  it  fades  out 
another codon starts up and the next "moment" begins. The new codon is determined
by the sensory input (if not blocked), the previously excited codon, and
the  current  dominant  emotion. 


A  reward-punishment  mechanism  is  started  up  by  the  medial  forebrain  bundle 
whenever  activity  in  the  hypothalamus  or  cingulate  gyrus  is  reduced.  The  role 
of  this  mechanism  is  to  facilitate  all  recently  fired  synapses.  (LeDoux,  1986) 


III.  THE  IMPLEMENTATION.  Pacrat  exists  in  a  contrived  universe:  a  very 
simple  universe  which  is  seen  as  partitioned  by  a  rectangular  grid  (figure  2). 

At each location in the grid one of Pacrat's sensory neurons, unique to that
location,  becomes  active.  This  gives  him  a  location  sense.  At  genesis  he  does 
not  know  where  one  location  is  relative  to  another,  but  he  does  know  that  he  is 
where  he  is.  He  has  also  been  given  the  ability  to  sense  his  own  trail,  and  has 
a general aversion to going where he has recently been. Again, the individual
activity  of  the  sensory  neurons  is  not  simulated,  only  the  relationship  with 
other  active  neurons.  His  burrow  (or  starting  point)  is  always  at  row  11, 
column 1. This is indicated by shading that cell. Pacrat's current location is
given  by  highlighting  the  cell  he  is  in.  (Walter,  1950,  1951) 

One codon is active at any time and this represents a 'moment' in Pacrat's
life. This codon is excited by current sensory input to the isocortex plus the
previously excited codon and the prevailing emotion. The inputs are the location
sense, which is gated through the thalamus (figure 1), smell, and the
axonal bundles from the hypothalamus and cingulate gyrus. A codon in the simulation
is simply a vector of scalars representing the current sensory input (if
any), normalized synaptic weights to the four motor neurons, associative connections
to other codons, dominant emotion, and synaptic weights to the amygdala,
thalamus,  and  hippocampus.  (Mishkin,  et  al.,  1987) 
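
A codon of this kind might be represented as follows. This is a hedged sketch: the field names, types, and default values are illustrative assumptions and are not taken from the actual simulation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Codon:
    """One codon as a record of scalars (hypothetical field names)."""
    location: Optional[int] = None            # current sensory input; None if blocked
    motor_weights: List[float] = field(       # normalized weights to the N, S, E, W motor neurons
        default_factory=lambda: [0.25, 0.25, 0.25, 0.25])
    associations: Dict[int, float] = field(default_factory=dict)  # links to other codons
    emotion: str = "hunger"                   # dominant emotion when the codon was formed
    amygdala_weight: float = 0.0
    thalamus_weight: float = 0.0
    hippocampus_weight: float = 0.0

    def normalize(self) -> None:
        """Rescale the motor weights so that they sum to one."""
        total = sum(self.motor_weights)
        self.motor_weights = [w / total for w in self.motor_weights]

codon = Codon(location=5)
codon.motor_weights[2] += 0.25   # pretend the synapse on the east neuron was facilitated
codon.normalize()
print(codon.motor_weights)
```

Keeping the motor weights normalized means a reward changes the relative preference among directions without letting any weight grow without bound.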

Pacrat's motivation is hunger and fear. When awake, Pacrat is forced to
move  by  one  or  the  other,  or  else  he  just  goes  to  sleep  as  the  RAS  quiets  down. 
(Kissin,  1986,  Chap  2).  Initially  this  drive  is  hunger.  In  the  simulation,  the 
distension  of  the  stomach  is  represented  by  a  scalar.  This  number  continually 
decreases  unless  Pacrat  is  at  a  food  spot  and  is  eating.  When  this  number  is 
low  enough,  the  hypothalamus  responds  (again  a  scalar)  and  the  RAS  is  excited. 
Pacrat  wakes  up.  He  is  forced  to  explore  his  universe  for  food  to  satisfy 
hunger.  Food  is  placed  randomly  in  one  of  three  locations.  The  three  potential 
food  spots  are  highlighted  on  the  right  side  of  the  grid,  with  food  located  in 
one  of  the  cells.  When  he  reaches  a  food  spot,  he  eats,  his  stomach  fills  up, 
and the activity in the hypothalamus is significantly reduced. This is simulated
by simply increasing the number corresponding to distension of the stomach
which  is  sensed  by  the  neurons  lining  the  stomach  wall.  These  neurons  have 
efferents  to  the  hypothalamus. 
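
The hunger drive described above can be sketched with two scalars. The decay rate, meal size, threshold, and function names below are assumptions chosen for illustration; the simulation's actual constants are not given in the text.

```python
def step_stomach(distension, eating, decay=0.02, meal=0.5):
    """Advance the stomach-distension scalar by one moment, clamped to [0, 1]."""
    distension = distension + meal if eating else distension - decay
    return max(0.0, min(1.0, distension))

def hunger_activity(distension, threshold=0.4):
    """Hypothalamic hunger activity: zero while full, rising as the stomach empties."""
    return max(0.0, threshold - distension) / threshold

def ras_awake(hunger, fear):
    """The RAS is excited whenever the hypothalamus or the cingulate gyrus is active."""
    return hunger > 0.0 or fear > 0.0

d = 0.5
for _ in range(10):                 # ten moments without food
    d = step_stomach(d, eating=False)
hungry = hunger_activity(d)
fed = hunger_activity(step_stomach(d, eating=True))
print(d, hungry, fed, ras_awake(hungry, 0.0), ras_awake(fed, 0.0))
```

With these made-up constants the stomach empties past the threshold, hunger wakes Pacrat, and a single meal silences the hunger center again.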

Pacrat  can  move  north,  south,  east,  and  west  within  the  boundaries  of  his 
universe.  Motor  neurons  drive  Pacrat  in  one  of  these  directions  one  cell  at  a 
time.  Each  active  codon  in  the  isocortex  has  efferents  on  each  of  these  motor 
neurons and the relative effectiveness of these efferents determines the direction
of travel. At the outset, i.e. trial 1, there is no preferred direction of
movement.  The  synaptic  weights  from  any  given  codon  to  the  motor  neurons  are 
identical.  Pacrat  moves  about  his  universe  randomly  until  food  is  found.  When 
it  is,  a  reward  mechanism  is  activated  through  the  medial  forebrain  bundle  which 
facilitates all recently fired synapses. This is learning and will generate a
preferred  direction  of  movement  when  similar  codons  are  active  in  the  future. 

The  vectors  representing  codons  are  changed  so  that  elements  corresponding  to 
synaptic  connections  between  simultaneously  active  neurons  are  increased. 
Facilitation  is  proportionally  lower  for  codons  active  earlier  in  time. 
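
The facilitation rule can be sketched as a pass backward along the recent trail of active codons. The geometric decay, the learning rate, and the data layout are assumptions made for this sketch; the text states only that facilitation is proportionally lower for codons active earlier in time.

```python
def facilitate(trail, weights, rate=0.2, decay=0.5):
    """Strengthen the motor synapses used along the recent trail.

    trail   : list of (codon_id, direction_index) pairs, oldest first
    weights : dict mapping codon_id to four motor weights (N, S, E, W)
    """
    gain = rate
    for codon_id, direction in reversed(trail):    # most recent codon first
        w = list(weights[codon_id])
        w[direction] += gain
        total = sum(w)
        weights[codon_id] = [v / total for v in w]  # keep the weights normalized
        gain *= decay                               # earlier codons receive less

weights = {1: [0.25] * 4, 2: [0.25] * 4}
trail = [(1, 2), (2, 0)]    # codon 1 moved east, then codon 2 moved north and found food
facilitate(trail, weights)
print(weights)
```

After the reward, the most recent codon has the strongest preferred direction, and the earlier one a weaker but still noticeable bias.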

After hunger is satiated and the level of activity of the hypothalamus
reduced,  fear  is  no  longer  masked  by  hunger.  Fear  keeps  the  activity  of  the  RAS 
high.  An  active  cingulate  gyrus  drives  Pacrat  back  to  his  burrow.  Again,  if 
this  is  the  first  trial,  there  is  no  preferred  direction,  but  the  codons  which 
are  activated  are  those  associated  with  fear  rather  than  hunger.  The  neurons  in 
the  cingulate  gyrus,  not  the  hypothalamus,  excite  neurons  in  the  isocortex. 

When  his  burrow  is  reached,  Pacrat’s  back  is  covered.  The  activity  of  the 
cingulate  gyrus  is  abruptly  decreased.  The  reward  mechanism  is  again  activated 
through  the  medial  forebrain  bundle  and  recently  fired  synapses  are  facilitated. 
This  will  generate  biased  movement  in  the  future  if  these  codons  are  active. 
Henceforth,  at  any  cell  in  the  grid,  he  will  tend  to  go  in  a  direction  depending 
on  which  neurons  are  active  in  the  brain  centers.  An  active  hypothalamus  may 
move  him  east,  an  active  cingulate  gyrus  with  the  same  sensory  input  may  drive 
him  north. 

During epigenesis, Pacrat learns to survive. Randomness forces Pacrat out
of obsessive behavior patterns. Although a reward mechanism may increase synaptic
strength between a codon and a given motor neuron, there is always a chance
Pacrat will move in a different direction. A built-in random element raises the
level of activity of motor neurons with a lower synaptic weight to the currently
active codon. An active hippocampus increases the effect of this random element.
If this were not present, Pacrat would not survive. Once food was
found, he would follow the same path again and again. However, as the synaptic
strength  between  a  codon  and  motor  neuron  is  increased,  it  becomes  more  and  more 
difficult  for  Pacrat  to  alter  his  behavior.  He  will  continue  searching  for  food 
in  locations  where  it  does  not  exist.  To  resolve  this  we  have  given  Pacrat  an 
amygdala.  An  active  amygdala  mediates  anger.  Excited  neurons  in  the  amygdala 
generate unique active codons in the isocortex. Figure 1 shows the role of the
amygdala in Pacrat. When the reward mechanism is active, all recently fired
synapses are facilitated through the medial forebrain bundle. These include
synapses from the active codon in the isocortex to the amygdala. Therefore, if
this same codon is excited in future trials, the amygdala also becomes highly
active. This high level of activity excites a region in the hypothalamus associated
with anger or frustration. Unless there is a concurrent good experience,
such as eating, which will inhibit the amygdala, Pacrat will become angry. In
other  words,  he  gets  mad  when  food  is  not  where  it  is  supposed  to  be.  This  anger 
quickly  drives  Pacrat  out  of  the  vicinity  of  a  food  spot  by  exciting  different 
codons  in  the  isocortex.  These  codons  do  not  have  synaptic  weights  to  the 
motor  area  that  favor  any  given  direction.  He  is  effectively  'bounced'  randomly 
to  neighboring  locations  in  the  grid.  Without  the  amygdala,  Pacrat  would  keep 
looking  for  food  in  the  same  spot  almost  indefinitely.  The  level  of  activity  in 
the  hypothalamus  is  far  greater  than  that  of  the  hippocampus.  When  he  is 
starving,  he  doesn't  get  bored. 
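
The interplay between learned synaptic weights and the random element can be sketched as a noisy choice among the four motor neurons. In this simplified illustration uniform noise is added to every motor neuron and scaled by hippocampal activity; the function name, the noise model, and the constants are assumptions, not the simulation's actual rule.

```python
import random

def choose_direction(motor_weights, curiosity=0.0, noise=0.1, rng=random):
    """Return the index (0-3 for N, S, E, W) of the most active motor neuron.

    Activity is the synaptic weight plus a random element; a more active
    hippocampus (curiosity) enlarges the random element.
    """
    scale = noise * (1.0 + curiosity)
    activity = [w + scale * rng.random() for w in motor_weights]
    return max(range(4), key=lambda i: activity[i])

rng = random.Random(0)                      # seeded so the sketch is repeatable
learned = [0.97, 0.01, 0.01, 0.01]          # strongly facilitated "north" synapse
habitual = {choose_direction(learned, rng=rng) for _ in range(100)}
curious = {choose_direction(learned, curiosity=20.0, rng=rng) for _ in range(1000)}
print(habitual, curious)
```

With a small random element the strongly facilitated direction always wins, which is the obsessive pattern described above; a large enough random element lets the other directions win occasionally.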

The effect of the rhythmic action of the thalamus is that a moment (active
codon n) that generates motor output is followed by several moments (active
codons n+1, n+2,...) with motor output inhibited. This is the first of three
forms that thinking takes. The sensory input is temporarily blocked, motor output
inhibited, and associated codons in the isocortex are turned on. This form
of thinking is implemented in Pacrat as he effectively evaluates the consequences
of his last move.


A  second  form  of  thinking  comes  about  when  an  excited  thalamus  results  in 
an  extended  period  of  blocking  of  sensory  input.  Again,  the  normal  state  in  the 
isocortex  is  association  so  codons  continue  to  fade  in  and  out.  If  the  chain  of 
associating  codons  reaches  a  codon  with  inhibitory  efferents  on  the  thalamus, 
activity  of  neurons  in  the  reticular  complex  is  reduced.  The  blocking  cycle  of 
sensory  input  is  reduced  to  a  minimum  and  Pacrat  proceeds  to  move  with  intent. 
This  form  of  thinking  is  recognition.  It  is  initiated  when  Pacrat  moves  to  an 
area of particular interest to him on the grid. This is a location where synapses
from  the  isocortex  to  the  thalamus  have  been  facilitated  from  previous  rewards. 

A third form of thinking comes about when the extended period of blocked
sensory input and inhibited motor output results in slightly different associated
codon chains. This can occur because of the inherent randomness of
neural actions. If one of these chains results in activating a codon quicker
than recent paths have done, the neurons of this codon are in a different state
of molecular depletion: the codon has had less time to recover from its last
activation. It comes on with a burble which is transmitted to the reward system, and
recently fired synapses are facilitated. This is insight and is the basic mechanism
of rational thought. Pacrat has demonstrated this by "thinking" of more
efficient paths to food.


IV.  TRIAL  RUN.  Figure  3  shows  four  static  displays  of  a  typical  trial  run 
of  the  simulation.  Figures  3a  -  3d  are  snapshots  in  trial  702.  The  activity  of 
the  brain  centers  is  given  by  a  bar  chart  on  the  left  side  of  the  display.  The 
larger the bar, the more active that area of the brain. Even-numbered trials
(e.g., 702) display the effect of a particular neural center (e.g., hunger,
anger), while odd-numbered trials (e.g., 703) give the name of the center. The
number of steps indicates the sequence of each snapshot in the trial.

In figure 3a Pacrat has just left his burrow, the starting point. The activity
of the hypothalamus was high enough to activate the RAS and wake him up.
Figures 3a-c show Pacrat driven by hunger. Through epigenesis, that is, his
previous 701 trials, he has learned. Figure 3b shows Pacrat with his motor output
inhibited by the thalamus. This is shown by freezing him at his current
location and dynamically displaying his codon association chain. The reward
cell for this trial is in the middle of the three possible food locations (row
11, column 18). From past experience, food has been known to be located in the
last reward cell (row 19, column 18). Pacrat's codon chain eventually associates
to  this  location  and  facilitated  synapses  from  the  isocortex  stop  the  thalamus 
from  blocking  sensory  input  and  inhibiting  the  motor  area.  His  motor  output  is 
no  longer  inhibited.  Figure  3c  shows  the  active  amygdala  when  food  is  not  found 
where  it  was  expected. 


His  frustration  forces  him  out  of  the  vicinity  of  the  empty  food  cell  and 
eventually  he  locates  the  food.  Figure  3d  shows  Pacrat  moving  with  intent  back 
to his burrow, driven by fear. The activity in the thalamus (thinking) indicates
it has been inhibited from blocking sensory input and from inhibiting motor
output.  This  is  a  result  of  recognition  resulting  in  strong  inhibitory  input 
from  the  isocortex. 


V.  IMPLEMENTING  PACRAT  AS  A  NEURAL  NET.  A  simplified  version  of  Pacrat 
has  been  implemented  using  only  simulated  neurons,  formal  definitions  of  neural 
activity,  and  synaptic  facilitation.  Neural  activity  is  modeled  using  PID 
(proportional-integral-differential)  control.  The  governing  equations  for  cell 
activity  and  synaptic  facilitation  are  given  in  figure  4.  Synaptic  facilitation 
has both a Hebbian (associative) and a non-Hebbian (reward-punishment) component.
A selectable resting frequency and codon saturation frequency were
included  to  help  with  governance  of  the  network. 

This  simulation  is  called  Mouse.  Figure  5  gives  a  static  display  at  one 
point  in  the  simulation.  The  shaded  circles  on  the  left  represent  neurons. 

Mouse,  like  Pacrat,  lives  in  a  bounded  universe.  This  universe  is  the  ten  by 
ten  grid  on  the  right.  At  each  cell  in  the  grid,  a  single  sensory  neuron 
becomes active. The color of the circles reflects the activity. The color
changes  gradually  from  blue  to  red  to  white  as  the  activity  increases.  Since 
color  is  not  reproduced  in  this  report,  the  active  neurons  are  circled.  Each 
sensory  neuron  has  excitatory  efferents  on  each  of  four  motor  neurons.  These 
motor  neurons  are  labelled  N  (north),  S  (south),  E  (east),  and  W  (west).  When 
the activity of one of these motor neurons exceeds a preset threshold, Mouse
moves  one  cell  in  that  direction  (within  the  boundaries)  and  a  different  sensory 
neuron  becomes  excited.  As  in  Pacrat,  there  are  three  potential  reward  cells. 

At  the  beginning  of  each  trial,  food  is  placed  randomly  in  one  of  these  cells. 
These  cells  are  the  three  shaded  cells  in  column  9  as  shown  in  figure  5.  The 
dark  cell  gives  the  location  of  the  reward  cell  for  that  trial.  When  Mouse 
reaches  a  cell  where  food  is  located,  a  reward  mechanism  is  activated  and 
recently  fired  synapses  are  facilitated.  The  normalized  synaptic  weights  from 
the  sensory  neurons  to  the  motor  neurons  are  shown  by  the  arrows  in  the  grid. 
There  is  always  an  element  of  randomness  associated  with  each  move,  but  the 
larger  the  arrow  the  more  likely  Mouse  will  move  in  that  direction.  Initially, 
(i.e.,  trial  one)  Mouse  has  no  preferred  direction  of  movement  and  the  arrows 
have  zero  length  and  direction.  Figure  5  gives  the  normalized  weights  after 
1000  trials.  Mouse  always  starts  at  row  6,  column  1  and  his  current  location  in 
the grid is highlighted. In order to avoid obsessive-compulsive behavior, Mouse
has  been  given  a  sense  of  smell.  He  is  designed  to  avoid  his  own  trail.  This 
is  accomplished  via  four  sensory  neurons  with  inhibitory  efferents  on  the  motor 
neurons.  These  are  labeled  1/N,  1/E,  1/S,  and  1/W  indicating  their  effect  on 
that  direction  of  travel.  The  necessity  for  these  is  evident  if  one  imagines 
four  arrows  in  the  grid  forming  a  loop. 

Figure  5  shows  Mouse  after  a  single  move.  He  has  just  moved  north  so  there 
is  a  high  level  of  neural  activity  in  the  neuron  inhibiting  motor  neuron  S 
(south). Since this is the 1000th trial, Mouse has a preferred direction of
movement. At this location, as shown by the arrow, it is east. The large synaptic
weighting from the currently active sensory neuron to motor neuron E (east)
is raising the activity of this motor neuron more than the others. It is therefore
likely that Mouse will move east.
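The Mouse mechanism described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the PID activity equations of figure 4 are replaced by a simple "largest noisy drive wins" rule, the sense of smell is reduced to a penalty on reversing the last move, and the names `choose_move`, `step` and `run_trial`, together with every numeric constant (noise level, penalty, reward increment, length of the "recently fired" trace), are hypothetical.

```python
import random

DIRECTIONS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}
SIZE = 10  # ten-by-ten grid, as in the text


def choose_move(cell_weights, last_move, rng, noise=0.1, trail_penalty=0.5):
    """Return the motor neuron (direction) with the highest activity."""
    activity = {}
    for d in DIRECTIONS:
        a = cell_weights[d] + noise * rng.random()  # synaptic drive plus randomness
        if last_move is not None and d == OPPOSITE[last_move]:
            a -= trail_penalty  # assumed stand-in for the 1/N, 1/E, 1/S, 1/W neurons
        activity[d] = a
    return max(activity, key=activity.get)


def step(pos, direction):
    """Move one cell in the given direction, staying inside the grid."""
    dr, dc = DIRECTIONS[direction]
    r = min(max(pos[0] + dr, 0), SIZE - 1)
    c = min(max(pos[1] + dc, 0), SIZE - 1)
    return (r, c)


def run_trial(weights, food, rng, start=(5, 0), max_steps=500, reward=0.05):
    """Run one trial; on reaching food, facilitate recently fired synapses.

    Returns the number of moves taken, or None if food was not reached.
    """
    pos, last, fired = start, None, []
    for _ in range(max_steps):
        d = choose_move(weights[pos], last, rng)
        fired.append((pos, d))
        pos, last = step(pos, d), d
        if pos == food:
            for cell, direction in fired[-20:]:  # hypothetical trace length
                weights[cell][direction] += reward
            return len(fired)
    return None
```

Here `weights` is a dictionary mapping each grid cell to its four sensory-to-motor weights, all zero at trial one, and `start=(5, 0)` is row 6, column 1 in zero-based indexing. Repeated calls to `run_trial` with food placed in one of three chosen cells would let the weights, and hence the arrows of figure 5, build up over trials.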

VI. CONCLUSIONS. A brassboarded feeling-thinking machine is possible. We
believe it is not practical at the moment to consider casting everything in
silicon; therefore, the neural network section of the machine will be emulated in
a highly parallel computer ensemble.

VII. ONGOING EFFORTS. The Pacrat simulation is being completely rewritten
so that the neurons are explicitly modeled. This is preparatory to moving the
simulation to a transputer network running under an Occam harness.

CELL ACTIVITY (PID)

i = postsynaptic, j = presynaptic

$$x_i = \left[ A X_i + B \int (K_i - X_i)\,dt + C X + D \cdot \mathit{XI} + E \cdot R \right]^{+}$$

$$\mathit{XI} = \mathit{XI} + F \cdot \left( \sum_j X_j W_{ij} - X_i \right)$$

X = cell activity
K_i = resting frequency
W_ij = synaptic weight
A, B, C, D, E, F = empirical constants
R = rectangular distribution on (0.0, 1.0)

SYNAPTIC FACILITATION

$$w_{ij} = -A W_{ij} + B \cdot H(X_i, X_j) + T(R,P) \cdot a \cdot \int H(X_i, X_j)\,dt$$

$$a = C a + D \cdot H(X_i, X_j)$$

$$T(R,P) = E \cdot R + F \cdot P$$

$$H(X_i, X_j) = [X_i - K_i]^{+}\,[X_j - K_j]$$

A, B, C, D, E, F = empirical constants
K_i, K_j = resting frequencies
R = instantaneous reward level
P = instantaneous punishment level
R = P = 0 or E = F = 0 => Hebbian


Figure 4. Cell Activity and Synaptic Facilitation.


POLYNOMIAL DEFINITION OF DISCRETE FIELD
POINT OF MAP OF DIFFUSION EQUATION

ABSTRACT

The one dimensional diffusion equation,

$$\frac{\partial T}{\partial t} = a \frac{\partial^2 T}{\partial x^2},$$

is given finite difference expression, transformed to geometric and then algebraic
context, and then, by differencing, recomposed into a general proposition. Discrete
terms of the algebraic transposition take the terminating polynomial form

$$T(N,P) = \frac{\phi}{2^h} \left( A - Bm + Cm^2 - \cdots \pm m^k \right)$$

where the coefficients A, B, C, etc., which turn out to be rational expressions,
are analyzed by differencing methods. The systematic reduction to a base-line
source reveals a general behavior pattern re-expressed in compressed tables,
from which the algebraic form of any (N,P) term can be recomposed.


INTRODUCTION 


The diffusion equation of physics has been used to analyze unsteady heat
transfer, boundary layer velocity distribution, long line electrical voltage
fluctuation and salt-solute penetration. The general mathematical expression
is $\partial T/\partial t = a\,\partial^2 T/\partial x^2$, where particular physical constraints determine the
context. There are two approaches to the problem statement and solution: the
one most widely used is transformation; the other, following finite
differences, employs variations of summing averages of term values
established by unique methods. This report considers an averaging type
solution in algebraic format. The final result consists of a series of
discrete polynomials with rational coefficients which describe the dependent
variable state at each time-distance coordinate in the manner of the non-reflecting
Schmidt plot.


PROCEDURE 

Essentially, the differential equation is given finite difference
expression which is transposed first to geometric and then to polynomial
algebraic form. The polynomials, representing discrete solutions to the
differential equation, are analyzed by differencing techniques whereby the
numerical coefficients of common diagonal terms are found to be expressible in
a generalized matrix.

From the one dimensional partial differential equation

$$\frac{\partial T}{\partial t} = a \frac{\partial^2 T}{\partial x^2} \qquad (1)$$

where T is the dependent variable,
t is the independent variable,
x is the independent variable,
and a is a constant.

Heat transfer language makes
T = temperature,
t = time,
x = distance,
and a = diffusivity.

Application of the finite difference procedure gives

$$\frac{\Delta_t T}{\Delta t} = a \frac{\Delta_x^2 T}{\Delta x^2} \qquad (2)$$

with $\Delta t$ the finite difference in time,
$\Delta x$ the finite difference in distance,
$\Delta_t T$ the change in T effected by the time variable,
and $\Delta_x^2 T$ the change in T effected by the distance variable.

By expansion the equation becomes

$$\frac{T_n^{(t+1)} - T_n^{(t)}}{\Delta t} = a \, \frac{T_{n-1}^{(t)} - 2T_n^{(t)} + T_{n+1}^{(t)}}{\Delta x^2} \qquad (3)$$

where subscript n refers to the x increments and superscript t refers to the
t increments. Schmidt (1) developed the graphical form shown on Figure 1 with
the stepwise linear temperature gradients across adjacent layers of material.
Since the change in internal energy within a layer of material over a finite
time is the difference between the heat flow in and heat flow out, the
corresponding temperature increment becomes a function of the ratio
$\Delta x^2 / (2a\,\Delta t)$, and it is convenient to select this ratio as unity.

Whereby:

$$\Delta t = \frac{\Delta x^2}{2a} \qquad (4)$$
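With the ratio chosen as unity, the expanded difference equation (3) reduces to a two-point average: each interior temperature at the next time step is the mean of its two neighbors at the current step. The short Python sketch below illustrates that averaging rule only. The function name `schmidt_step` and the choice to hold both end values fixed are assumptions made for illustration; the boundary treatment that introduces the proportion m in the paper is not reproduced here.

```python
def schmidt_step(T):
    """Advance the interior nodes one time step; the two end values are held fixed.

    With dx**2 / (2 * a * dt) = 1, the finite difference form becomes
    T_n(t+1) = (T_{n-1}(t) + T_{n+1}(t)) / 2.
    """
    new = list(T)
    for n in range(1, len(T) - 1):
        new[n] = 0.5 * (T[n - 1] + T[n + 1])
    return new


# A hot face at the left of an initially cold slab (assumed example data).
profile = [1.0, 0.0, 0.0, 0.0]
profile = schmidt_step(profile)   # [1.0, 0.5, 0.0, 0.0]
profile = schmidt_step(profile)   # [1.0, 0.5, 0.25, 0.0]
```

Each step halves and spreads the disturbance one layer deeper, which is the numerical counterpart of the straight-line construction in the Schmidt plot.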


Also,  a  geometric  simplification  results  from  defining  the  graphical 
proportions  as 


m 


_ 

Ax  +  Ax 


(5) 


Table 1 shows the discrete algebraic expressions for the time-temperature-distance
intersections of Figure 1. Along the diagonals of Table 1 a matched
power polynomial appears and the coordinate expression for T in time and
distance takes the general form

$$T(N,P) = \frac{\phi}{2^h} \left( \frac{A}{2^j} - \frac{B}{2^j} m + \frac{C}{2^j} m^2 - \cdots \pm m^k \right) \qquad (6)$$

where N is a distance index,
P is a time index,
h is the external denominator exponent,
k is the terminal exponent of m,
j = (k - term exponent of m) is the individual term denominator exponent,
and $\phi = m (T_0 - T_\infty)$, with A, B, C, ... the numerical coefficients of the interior
terms of the equation.


(1) See George P. Sutton, Rocket Propulsion Elements, John Wiley & Sons, New
York, 1956.

To establish T(N,P) for any $\phi$ and m (which include the physical constraints)
it is necessary to establish the precise values of the coefficients A, B, C,
...; and this is the object of the current investigation.

Any full expression, T(N,P), can be developed from a Gregory-Newton
formulation of the separate terms A, B, C, for the bounded diffusion equation
as shown in Figure 2. Some interesting progressions do result, but an alternate
and more geometric presentation is available from the direct finite difference
tables.


Table  2  is  also  extracted,  by  difference  equation  procedure,  from  Table  1 
and  generates  the  coefficients  of  the  constant  term,  A,  for  the  coordinate 
expression  of  distance  and  time.  The  starting  point  is  within  the  heavy  box  of 
column  7.  These  numbers,  17548,  25147,  35401,  49024  and  66868,  were  found  by 
direct  calculation  using  Figure  1  and  Table  1  and  are  the  constant  numerator 
terms  only.  Table  3  lists  the  complete  polynomial  expressions  for  these 
coordinates.  By  the  usual  differencing,  the  7th  through  the  1st  columns  are 
established.  It  is  then  possible  to  work  vertically  using  the  regression  in 
column  2,  back  to  zero;  and  then  to  complete  the  elements  of  columns  1  through 
6. Noting the resulting bias progression at the tops of these columns, the next
step is to continue diagonally (I, II, III) to column 8 and, using column 7 as a
summation,  verify  the  vertical  sequence  of  column  8.  Columns  9,  10,  11,  etc., 
are  generated  similarly. 
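The "usual differencing" applied to the five starting numerators can be sketched as a forward difference table. The Python below is only an illustration of that step, with the hypothetical helper name `difference_table`; it does not reproduce the layout of Table 2, the vertical regression to zero, or the diagonal continuation described above.

```python
def difference_table(values):
    """Return the rows of a forward difference table, starting with the values."""
    rows = [list(values)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


# Constant numerator terms quoted in the text for the heavy box of column 7.
table = difference_table([17548, 25147, 35401, 49024, 66868])
for order, row in enumerate(table):
    print(order, row)
# 0 [17548, 25147, 35401, 49024, 66868]
# 1 [7599, 10254, 13623, 17844]
# 2 [2655, 3369, 4221]
# 3 [714, 852]
# 4 [138]
```

The successive differences 7599, 2655, 714 and 138 are the same numbers that appear as constant-term numerators in Table 1 (7599/32, 2655/16, 714/8 and 138/4), which is the kind of regularity the differencing procedure exploits.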

Within the individual frames containing the coefficients is a parenthesized
pair of numbers which indicates the distance and time coordinate. These indices
run diagonally upwards at constant distance and bi-sequentially as time. Tables
4 through 9 are formed by the same procedure and extend arbitrarily to the 6th
power of m. However, a different sequence appears along diagonals I, II and III
according to the power of m. Table 10 summarizes this behavior and reveals yet
another correlation, shown mainly by column 4, from which the adjacent columns
can be constructed ad infinitum. The final coincidence occurs from a re-inspection
of Tables 2, 4-9 where the digital vertical counting column (1, 2, 3,
4, 5, ...) conjoins the power of m and the time sequence of the first distance
diagonal (1,3), (1,5), (1,7), etc., by an interval of 3 in the counter according
to Table 11.


(2) M.R. Spiegel, Theory and Problems of Finite Differences and Finite
Difference Equations, Schaum's Outline Series in Mathematics, McGraw-Hill Book
Company, New York, etc., 1971, pp. 36-44.


RESULTS 


To write any term defining the dependent variable in time and distance,
Tables 2 through 11 are used to form the numerical coefficients in Equation (6)

$$T(N,P) = \frac{\phi}{2^h} \left[ \frac{A}{2^j} - \frac{B}{2^j} m + \frac{C}{2^j} m^2 - \cdots \pm m^k \right]$$

For T(1,15), for instance,

N = 1
P = 15

$$k = \frac{P - N - \left| \sin \left( (P - N)\pi/2 \right) \right|}{2} = \frac{(15 - 1) - \left| \sin 7\pi \right|}{2} = 7$$

$$h = \frac{P + N - 2 - \left| \sin \left( (P + N)\pi/2 \right) \right|}{2} = \frac{(15 + 1) - 2 - \left| \sin 8\pi \right|}{2} = 7$$

j = (k - term exponent of m) = (7 - term exponent of m).

The finite difference tables for the (1,15), j variables are then
reconstructed (Tables 2, 4-9) using Tables 10 and 11 and the respective
numerators determined.


Whereby

Term   Numerator   Exponent of m   Value of j
A      51480       0               7
B      39796       1               6
C      20264       2               5
D      7050        3               4
E      1672        4               3
F      260         5               2
G      24          6               1
H      1           7               0

and

$$T(1,15) = \frac{\phi}{128} \left[ \frac{51480}{128} - \frac{39796\,m}{64} + \frac{20264\,m^2}{32} - \frac{7050\,m^3}{16} + \frac{1672\,m^4}{8} - \frac{260\,m^5}{4} + \frac{24\,m^6}{2} - m^7 \right]$$
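The assembly of T(1,15) from the index formulas and the tabulated numerators can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names `exponents`, `coefficients` and `evaluate` are hypothetical, the sine terms are replaced by the equivalent parity test, and phi is treated as a plain multiplier supplied by the caller.

```python
from fractions import Fraction


def exponents(N, P):
    """Return (k, h) for distance index N and time index P.

    For an integer n, |sin(n*pi/2)| is 0 when n is even and 1 when n is odd,
    so the sine terms in the text reduce to a parity check.
    """
    k = (P - N - (P - N) % 2) // 2
    h = (P + N - 2 - (P + N) % 2) // 2
    return k, h


def coefficients(numerators, N, P):
    """Coefficient of m**e in T(N,P)/phi, for e = 0..k."""
    k, h = exponents(N, P)
    coeffs = []
    for e, numerator in enumerate(numerators):
        j = k - e                     # individual term denominator exponent
        sign = -1 if e % 2 else 1     # signs alternate, starting positive
        coeffs.append(sign * Fraction(numerator, 2 ** (j + h)))
    return coeffs


def evaluate(coeffs, m, phi=1):
    """Evaluate T(N,P) for a given m and phi."""
    return phi * sum(c * Fraction(m) ** e for e, c in enumerate(coeffs))


# Numerators A..H for T(1,15), as listed in the text.
NUMERATORS_1_15 = [51480, 39796, 20264, 7050, 1672, 260, 24, 1]
coeffs = coefficients(NUMERATORS_1_15, N=1, P=15)
```

Multiplying each entry of `coeffs` by 128 gives the reduced fractions 6435/16, -9949/16, 5066/8 and so on, which agree with the first entry of the column-15 page of Table 1.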
TABLE 1. (continued)


(φ/8)(35/2 - 29/2 m + 12/2 m^2 -
(φ/8)(35/4 - 10/2 m + m^2)
(φ/16)(47/4 - 11/2 m + m^2)
(φ/16)(9/2 - m)
(φ/32)(5 - m)

TABLE 1. (continued)


9

(φ/16)(630/16 - 325/8 m + 95/4 m^2 - 15/2 m^3 + m^4)
(φ/16)(187/8 - 69/4 m + 13/2 m^2 - m^3)
(φ/32)(244/8 - 81/4 m + 14/2 m^2 - m^3)
(φ/32)(57/4 - 12/2 m + m^2)
(φ/64)(68/4 - 13/2 m + m^2)
(φ/64)(11/2 - m)
(φ/128)(6 - m)
φ/128
φ/256

TABLE 1. (continued)


10

(φ/16)(630/16 - 325/8 m + 95/4 m^2 - 15/2 m^3 + m^4)
(φ/32)(874/16 - 406/8 m + 109/4 m^2 - 16/2 m^3 + m^4)
(φ/32)(244/8 - 81/4 m + 14/2 m^2 - m^3)
(φ/64)(312/8 - 94/4 m + 15/2 m^2 - m^3)
(φ/64)(68/4 - 13/2 m + m^2)
(φ/128)(80/4 - 14/2 m + m^2)


(φ/32)(1386/16 - 843/8 m + 312/4 m^2 - 141/4 m^3 + 18/2 m^4 - m^5)
(φ/32)(874/16 - 406/8 m + 109/4 m^2 - 16/2 m^3 + m^4)
(φ/64)(1186/16 - 500/8 m + 124/4 m^2 - 17/2 m^3 + m^4)
(φ/64)(312/8 - 94/4 m + 15/2 m^2 - m^3)
(φ/128)(392/8 - 298/4 m + 16/2 m^2 - m^3)
(φ/128)(80/4 - 14/2 m + m^2)
(φ/256)(93/4 - 15/2 m + m^2)
(φ/256)(13/2 - m)
(φ/512)(14/2 - m)

TABLE 1. (continued)


12

(φ/32)(1386/16 - 843/8 m + 312/4 m^2 - 141/4 m^3 + 18/2 m^4 - m^5)
(φ/64)(1979/16 - 1093/8 m + 374/4 m^2 - 79/2 m^3 + 19/2 m^4 - m^5)
(φ/64)(1186/16 - 500/8 m + 124/4 m^2 - 17/2 m^3 + m^4)
(φ/128)(1578/16 - 608/8 m + 140/4 m^2 - 18/2 m^3 + m^4)
(φ/128)(392/8 - 298/4 m + 16/2 m^2 - m^3)
(φ/256)(485/8 - 123/4 m + 17/2 m^2 - m^3)
(φ/256)(93/4 - 15/2 m + m^2)
(φ/512)(214/8 - 16/2 m + m^2)
(φ/512)(14/2 - m)
(φ/1024)(15/2 - m)
φ/1024
φ/2048

TABLE 1. (continued)


13

(φ/64)(3003/16 - 4165/16 m + 1841/8 m^2 - 532/4 m^3 + 196/4 m^4 - 21/2 m^5 + m^6)
(φ/64)(1979/16 - 1093/8 m + 374/4 m^2 - 79/2 m^3 + 19/2 m^4 - m^5)
(φ/128)(2762/16 - 1397/8 m + 444/4 m^2 - 88/2 m^3 + 20/2 m^4 - m^5)
(φ/128)(1578/16 - 608/8 m + 140/4 m^2 - 18/2 m^3 + m^4)
(φ/256)(2063/16 - 731/8 m + 157/4 m^2 - 19/2 m^3 + m^4)
(φ/256)(485/8 - 123/4 m + 17/2 m^2 - m^3)
(φ/512)(592/8 - 139/4 m + 18/2 m^2 - m^3)
(φ/512)(214/8 - 16/2 m + m^2)
(φ/1024)(244/8 - 17/2 m + m^2)
(φ/1024)(15/2 - m)
(φ/2048)(16/2 - m)
φ/2048
φ/4096


TABLE 1. (continued)


14

(φ/64)(3003/16 - 4165/16 m + 1841/8 m^2 - 532/4 m^3 + 196/4 m^4 - 21/2 m^5 + m^6)
(φ/128)(4387/16 - 5562/16 m + 2285/8 m^2 - 310/2 m^3 + 216/4 m^4 - 22/2 m^5 + m^6)
(φ/128)(2786/16 - 2794/16 m + 444/4 m^2 - 88/2 m^3 + 20/2 m^4 - m^5)
(φ/256)(7599/32 - 3525/16 m + 1045/8 m^2 - 195/4 m^3 + 21/2 m^4 - m^5)
(φ/256)(2063/16 - 731/8 m + 157/4 m^2 - 19/2 m^3 + m^4)
(φ/512)(2655/16 - 870/8 m + 175/4 m^2 - 20/2 m^3 + m^4)
(φ/512)(592/8 - 139/4 m + 18/2 m^2 - m^3)
(φ/1024)(714/8 - 156/4 m + 19/2 m^2 - m^3)
(φ/1024)(244/8 - 17/2 m + m^2)
(φ/2048)(276/8 - 18/2 m + m^2)
(φ/2048)(16/2 - m)
(φ/4096)(17/2 - m)
φ/4096
φ/8192

TABLE 1. (continued)


15

(φ/128)(6435/16 - 9949/16 m + 5066/8 m^2 - 3525/8 m^3 + 836/4 m^4 - 130/2 m^5 + 24/2 m^6 - m^7)
(φ/128)(4387/16 - 5562/16 m + 2285/8 m^2 - 310/2 m^3 + 216/4 m^4 - 22/2 m^5 + m^6)
(φ/256)(25147/64 - 14649/32 m + 5615/16 m^2 - 1485/8 m^3 + 237/4 m^4 - 23/2 m^5 + m^6)
(φ/256)(7599/32 - 3525/16 m + 1045/8 m^2 - 195/4 m^3 + 21/2 m^4 - m^5)
(φ/512)(10254/32 - 4395/16 m + 1220/8 m^2 - 215/4 m^3 + 22/2 m^4 - m^5)
(φ/512)(2655/16 - 873/8 m + 175/4 m^2 - 20/2 m^3 + m^4)
(φ/1024)(3369/16 - 513/4 m + 97/2 m^2 - 21/2 m^3 + m^4)
(φ/1024)(714/8 - 156/4 m + 19/2 m^2 - m^3)
(φ/2048)(852/8 - 174/4 m + 20/2 m^2 - m^3)
(φ/2048)(138/4 - 18/2 m + m^2)
(φ/4096)(155/4 - 19/2 m + m^2)
(φ/4096)(17/2 - m)
(φ/8192)(18/2 - m)
φ/8192
φ/16384

TABLE 1. (continued)


16

(φ/128)(6435/16 - 9949/16 m + 5066/8 m^2 - 3525/8 m^3 + 836/4 m^4 - 130/2 m^5 + 24/2 m^6 - m^7)
(φ/256)(76627/128 - 43947/64 m + 25879/32 m^2 - 8485/16 m^3 + 1090/8 m^4 - 283/4 m^5 + 23/2 m^6 - m^7)
(φ/512)(35401/64 - 9502/16 m + 6835/16 m^2 - 825/4 m^3 + 259/4 m^4 - 24/2 m^5 + m^6)
(φ/512)(10254/32 - 4395/16 m + 1220/8 m^2 - 215/4 m^3 + 22/2 m^4 - m^5)
(φ/1024)(13623/32 - 5421/16 m + 1414/8 m^2 - 236/4 m^3 + 23/8 m^4 - m^5)
(φ/1024)(3369/16 - 513/4 m + 97/2 m^2 - 21/2 m^3 + m^4)
(φ/2048)(4221/16 - 600/4 m + 107/2 m^2 - 22/2 m^3 + m^4)
(φ/2048)(852/8 - 174/4 m + 20/2 m^2 - m^3)
(φ/4096)(1007/8 - 193/4 m + 21/2 m^2 - m^3)
(φ/4096)(155/4 - 19/2 m + m^2)
(φ/8192)(173/4 - 20/2 m + m^2)
(φ/8192)(18/2 - m)
(φ/16384)(19/2 - m)
φ/16384
φ/32768
ABSTRACT

This paper investigates the interaction between a planar shock wave and
a perturbed contact discontinuity. The interaction is simulated by a front
tracking method that uses a local steady state analysis to model the diffraction
patterns produced by the collision. This front tracking method automatically
adjusts the topology of the tracked interface to account for the wave interactions.
The acceleration of the contact discontinuity by the shock wave excites
unstable modes in the gas interface that generate Richtmyer-Meshkov instabilities.

The following is a shortened version of a paper submitted to the SIAM
Journal on Applied Mathematics.


1.  Introduction 

The  numerical  simulation  of  a  collision  between  a  planar  shock  wave  and  a  contact 
discontinuity surface is discussed in this paper. An important feature of this method of
simulation is the use of a front tracking algorithm that handles bifurcations of tracked waves.
Front  tracking  sharply  resolves  the  diffracted  wave  patterns  that  are  produced  as  the  two 
waves  collide.  It  also  gives  a  detailed  picture  of  the  growth  of  surface  instabilities  in  the  gas 
interface. 

The initial small amplitude linear analysis of the shock-contact interaction is due to
Richtmyer [1], and experimental confirmation was provided by Meshkov, et al. [2]. Thus
this interaction is usually referred to as the Richtmyer-Meshkov instability.

Recent calculations by D. L. Youngs [3] give a detailed view of this instability, including
the large amplitude, late time regime. Eulerian methods are used because the extreme degree
of interface complexity would lead to excessive mesh distortion in Lagrangian codes. However,
Eulerian codes tend to suffer from numerical diffusion that degrades the interface.
Youngs uses a volume in cell Eulerian method [4-7] with the monotonic advection method of
Van Leer [8,9] to enhance the interface resolution and to minimize the numerical diffusion.

There are two principal methodological differences between the computations of
Youngs and the ones presented in this paper. The first is the use of the front tracking
algorithm for an exact resolution of the interface, and the resulting absence of numerical
diffusion across the interface. In [4], Youngs states, "A possible way of tracking interfaces would
be to define each interface by a set of Lagrangian marker particles. However, this method
becomes logically complicated if the interfaces become highly distorted or the geometry is
complex." It is believed that the front tracking method used here shows that this problem has
been solved for interfaces with a considerable degree of complexity. The second principal
methodological difference is the use by Youngs of the monotonic advection method of Van
Leer. The front tracking algorithm at present uses a second-order Lax-Wendroff method for
the solution away from the tracked interface. However, there is no inherent incompatibility
between the method of front tracking and such second-order Godunov methods as the Van
Leer scheme or the PPM method of Colella and Woodward [10]; indeed, upgrades of the
front tracking algorithm are planned that will include such methods.

The shock-contact interaction modeled here is the original problem of Richtmyer [1], in
which a shock wave collides with an interface between two gases. Youngs, on the other hand,
models the shock tube experiment of Meshkov et al. [2] in which a shock originally incident
in the heavier gas collides with the contact discontinuity interface, is reflected by a rigid wall,
and the reflected shock again interacts with the interface. These problems are closely related
and close qualitative similarities between the results obtained here and those of Youngs are
observed, with the expected difference of an absence of numerical diffusion in the front
tracking method. A detailed comparison has not been attempted at this point. A second
difference is that in the published results of this paper the parameter ranges include strong
incident shocks with pressure ratios across the shocks up to 1000 and shock Mach
numbers up to 28. The most nearly comparable of the figures in the two papers is perhaps
figure 4.4(e) of this paper and figure 9 frame 3 of [3]. The mesh used in the front tracking
run  is  about  4.4  times  as  coarse,  per  mode,  in  the  direction  parallel  to  the  interface  as  is  that 
of  Youngs.  In  spite  of  the  coarser  grid,  the  results  show  a  considerably  finer  level  of  detail 
at  the  interface.  This  is  not  surprising,  since  the  front  tracking  algorithm  concentrates 
numerical  power  at  the  interface.  The  resolution  of  the  solution  in  the  untracked  portion  of 
the  computational  region  will  of  course  reflect  the  relative  coarseness  of  the  grid. 

The  number  and  types  of  the  unstable  modes  that  are  observed  in  a  shock  wave  and 
contact  discontinuity  interaction  depend  on  the  incident  shock  strength,  the  initial  geometry 
of  the  two  waves  and  the  physical  properties  of  the  gases.  A  single  mode  can  be  isolated 
when  the  incident  shock  wave  is  planar  and  the  contact  discontinuity  surface  has  the  shape  of 
a sine curve of a single period. More complicated initial geometries for the initial gas
interface can be used to study the interaction between different unstable modes.


2.  The  Front  Tracking  Algorithm 

The front tracking algorithm [11-14] is an adaptive grid method for the sharp resolution
of selected waves in numerical solutions to systems of partial differential equations in
two space dimensions:

$$w_t + \nabla \cdot f(w) = 0. \qquad (2.1)$$

Usually  these  waves  represent  discontinuities  in  the  solution  function,  for  example,  shock 
waves  or  contact  discontinuities.  The  selected  waves  are  tracked  by  superimposing  a  set  of 
one-dimensional  curves  onto  an  underlying  rectangular  grid.  These  curves  correspond  to  the 
location  of  the  tracked  waves  at  a  given  time  and  are  dynamically  modified  as  the  solution 
evolves  in  time. 

Some  terminology  will  be  helpful.  The  basic  data  structures  are  points,  bonds,  curves, 
nodes  and  interfaces,  see  [11].  A  point  describes  a  location  in  space  and  a  bond  contains  the 
information  needed  to  describe  an  oriented  linear  segment  connecting  two  points.  A  curve  is 
an  ordered  set  of  bonds,  and  thus  corresponds  to  a  piecewise  linear  ordered  curve  in  space. 
All  curves  are  assumed  to  be  continuous.  The  start  and  end  points  of  a  curve  are  called 
nodes.  Several  curves  may  meet  at  the  same  node.  An  interface  is  a  collection  of  curves  and 
nodes. 
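The terminology above maps naturally onto a handful of small record types. The following Python sketch is one possible rendering, not the data layout of the code described in [11]: all class and function names are hypothetical, nodes are represented simply by the end points of curves, and the state values carried on points and curves are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Point:
    """A location in the plane."""
    x: float
    y: float


@dataclass(frozen=True)
class Bond:
    """An oriented linear segment from `start` to `end`."""
    start: Point
    end: Point


@dataclass
class Curve:
    """An ordered, continuous chain of bonds (a piecewise linear curve)."""
    bonds: List[Bond]

    def __post_init__(self):
        if not self.bonds:
            raise ValueError("a curve needs at least one bond")
        for a, b in zip(self.bonds, self.bonds[1:]):
            if a.end != b.start:
                raise ValueError("bonds must join end to start")

    @property
    def start_node(self) -> Point:
        return self.bonds[0].start

    @property
    def end_node(self) -> Point:
        return self.bonds[-1].end


@dataclass
class Interface:
    """A collection of curves; its nodes are the curve end points."""
    curves: List[Curve] = field(default_factory=list)

    def nodes(self) -> Set[Point]:
        return {p for c in self.curves for p in (c.start_node, c.end_node)}


def curve_from_points(points: List[Point]) -> Curve:
    """Build a piecewise linear curve through the given points, in order."""
    return Curve([Bond(a, b) for a, b in zip(points, points[1:])])
```

Because several curves may share an end point, collecting the nodes into a set makes a junction where curves meet appear once, which mirrors the statement that several curves may meet at the same node.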

Values  for  the  state  variables  that  describe  a  solution  to  system  (2.1)  are  associated  with 
geometric  points  on  a  rectangular  grid.  In  addition,  since  the  tracked  curves  represent 
discontinuities in the solution function, two sets of state values are associated with each point
on a curve. These correspond to the value of the solution on either side of the curve at the
particular  point.  States  are  also  associated  with  the  start  and  end  of  each  curve.  These  states 
correspond  to  the  tangential  limits  of  the  solution  as  the  end  point  of  the  curve  is  approached 
along  the  curve.  The  solution  in  a  neighborhood  of  a  node  is  described  by  the  start  or  end 
states  of  the  curves  going  into  that  node. 

The propagation of the solution from time t to time t + Δt is divided into two main
parts: the propagation of the tracked wave structures (the front propagation), and the
updating of the values of the states at locations away from the tracked interface (the interior
propagation). The simulations described in this paper used an operator split Lax-Wendroff
method for the interior propagation. This finite difference method has been modified to use
the states on the tracked interface as boundary data.

At each non-node point P on the tracked interface a one-dimensional Riemann problem
is solved for the component of system (2.1) normal to the curve through P. The solution to this
Riemann problem gives the wave speed and a set of updated states at P. Later a second
sweep over the points is performed in which the contribution of the tangential component of
the equations is included. See [13] for a description of the details of these steps. The nodes
are treated separately from the non-node points on the interface, since the solution is fully
two-dimensional at such points and operator splitting does not apply. The states near a node
and the tangents of the curves joining the node define a two-dimensional Riemann problem
[15], and the solution of this Riemann problem is taken as the first order solution near the
node. Sometimes it is also possible to compute higher order corrections to the states near a
node that include such effects as the curvature of the incoming waves and the variability of
the solution near the node.


Often an explicit expression for the solution to a given two-dimensional Riemann problem
is unavailable. In such cases the solution to the two-dimensional Riemann problem is
approximated by finding a projection of the exact solution onto a subclass of functions that
will capture the main features of the interaction. The next section will discuss such a
projection for a node that corresponds to a shock-contact collision.

A  more  detailed  description  of  the  propagation  of  the  tracked  interface  for  one  time 
step  can  be  found  in  [16]. 


3.  The  Tracking  of  Shock-Contact  Interactions 

The direct simulation of the Richtmyer-Meshkov instability is based on a numerical
solution to the Euler equations for a non-viscous, non-heat conducting gas.

(Conservation of mass)

$$\rho_t + (\rho u)_x + (\rho v)_y = 0, \qquad (3.1a)$$

(Conservation of momentum)

$$(\rho u)_t + (\rho u^2 + p)_x + (\rho u v)_y = 0, \qquad (3.1b)$$

$$(\rho v)_t + (\rho u v)_x + (\rho v^2 + p)_y = 0, \qquad (3.1c)$$

(Conservation of energy)

$$\left[\rho\left(\tfrac{1}{2}q^2 + e\right)\right]_t + \left[\rho u\left(\tfrac{1}{2}q^2 + i\right)\right]_x + \left[\rho v\left(\tfrac{1}{2}q^2 + i\right)\right]_y = 0. \qquad (3.1d)$$

The variables $u$ and $v$ are the $x$ and $y$ components of the gas velocity at the point $(x, y)$,
and $q^2 = u^2 + v^2$. The thermodynamic variables $\rho$, $e$, $p$ and $i = e + p/\rho$ are respectively the
density, specific internal energy, pressure and specific enthalpy of the gas. The thermodynamic
variables for each gas are related by a caloric equation of state


$$e = e(\tau, S), \qquad (3.2)$$

where $e(\tau, S)$ is a convex function of the specific volume $\tau = 1/\rho$ and specific entropy $S$. In
general this equation of state will be different for the two gases on opposite sides of the gas
interface.

The pressure $p$ is given by $p(\tau, S) = -\partial e/\partial \tau$. Often this relation can be inverted to give
the entropy and hence the energy as a function of $p$ and $\rho$. This expression, called an
incomplete equation of state, is usually sufficient to solve the Euler equations. The numerical
examples described below used a polytropic equation of state,

$$e = \frac{p\,\tau}{\gamma - 1}, \qquad (3.3)$$

where the ratio between the specific heats $\gamma$ is a constant satisfying $1 < \gamma \le 5/3$. The author

and his colleagues are actively pursuing hydrodynamic simulations with more general equations
of state. Thus, the equation of state dependencies in our code have been modularized
to allow the optional use of other equation of state models. This modularity requires that all
hydrodynamic quantities be interpreted in terms that can be expressed for a general equation
of state. In particular, this involves such requirements as an equation of state independent
formulation for the solution of one-dimensional Riemann problems, and equation of state
independent expressions for the shock polars described below, see [16].
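The kind of modularity described above can be illustrated with a small interface that hides the equation of state behind a few methods. This is a minimal sketch under stated assumptions, not the authors' code: the names `EquationOfState`, `Polytropic` and `specific_enthalpy` are hypothetical, and the polytropic relations used are the standard ones, $e = p/((\gamma - 1)\rho)$ and $c^2 = \gamma p/\rho$.

```python
import math
from typing import Protocol


class EquationOfState(Protocol):
    """Hypothetical interface: anything the flow solver needs from a gas model."""

    def energy(self, rho: float, p: float) -> float: ...
    def pressure(self, rho: float, e: float) -> float: ...
    def sound_speed(self, rho: float, p: float) -> float: ...


class Polytropic:
    """Polytropic (ideal, constant gamma) gas."""

    def __init__(self, gamma: float):
        if gamma <= 1.0:
            raise ValueError("gamma must exceed 1")
        self.gamma = gamma

    def energy(self, rho: float, p: float) -> float:
        # specific internal energy e = p / ((gamma - 1) rho)
        return p / ((self.gamma - 1.0) * rho)

    def pressure(self, rho: float, e: float) -> float:
        # inverse of energy(): p = (gamma - 1) rho e
        return (self.gamma - 1.0) * rho * e

    def sound_speed(self, rho: float, p: float) -> float:
        return math.sqrt(self.gamma * p / rho)


def specific_enthalpy(eos: EquationOfState, rho: float, p: float) -> float:
    """Written against the interface only, so it works for any gas model: i = e + p/rho."""
    return eos.energy(rho, p) + p / rho
```

A routine such as `specific_enthalpy` never mentions $\gamma$, so a different gas model can be substituted without touching it.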

The  Richtmyer-Meshkov  instability  simulation  is  initialized  at  a  time  shortly  before  the 
incident shock wave reaches the gas interface. The incident shock is taken to be planar, and
the  contact  discontinuity  interface  is  given  an  initial  geometry  specified  by  input.  If  a  single 
mode  is  to  be  isolated  this  initial  geometry  is  that  of  a  sine  wave  of  a  single  period  across  the 
computational  domain.  The  gas  interface  is  assumed  to  be  at  rest  with  respect  to  the  gas 
ahead  of  the  incident  shock,  so  the  initial  data  is  piecewise  constant. 

Since  the  two  interacting  waves  are  tracked,  it  is  necessary  to  resolve  the  diffracted 
wave  patterns  that  are  produced  at  the  point  of  collision  between  the  two  waves.  In  the 
simulation  used  here,  these  diffraction  patterns  are  resolved  using  shock  polar  analysis. 
Briefly,  the  interacting  waves  are  approximated  by  their  tangents  near  the  point  of  collision 
between the incident shock wave and the gas interface, and the nearby states are approximated
by states that are constant between the interacting waves. It is further assumed that
there exists a Galilean transformation that translates this local approximation into a stationary
flow. This assumption will in general be valid provided the angle between the two
incident waves is small, i.e. if the initial amplitude of the perturbation of the contact
discontinuity is sufficiently small.


The analysis of the interaction between a planar shock wave and a planar contact
discontinuity for a polytropic equation of state is well known [14, 17, 18]. The type of
diffraction pattern that is observed is a function of the strengths of the interacting waves, the
angle at which the two waves meet, and the equations of state for the two gases. The simplest
of these diffraction patterns consists of the incident shock wave and contact discontinuity,
a single reflected wave that is either a shock or a Prandtl-Meyer rarefaction wave, a transmitted
shock wave, and a deflected gas interface behind the point of interaction. This so-called
regular shock diffraction is observed provided the angle between the interacting waves is
sufficiently small. Many other configurations besides the regular diffraction node are possible;
these include Mach type reflections and precursor shock type configurations, see [19, 20].

The interaction between a steady state shock wave and contact discontinuity can be
regarded as a Riemann problem for the steady flow Euler equations. The line perpendicular
to the upstream contact becomes the space-like axis, and the line parallel to the upstream
contact becomes time-like in the downstream direction. The data for the Riemann problem
consists of the state behind the incident shock wave, and the upstream state on the side of the
contact opposite to the incident shock. It is assumed that both states are supersonic. Again
this will be the case for sufficiently small incident angles. A solution is sought for this
Riemann problem in the class of self-similar functions that consist of constant states
separated by downstream oriented shocks, Prandtl-Meyer rarefactions, and contact
discontinuities. The abstract structure of this solution is completely analogous to that of the
time dependent Riemann problem in one space dimension. The difference comes from the
increased nonlinearity of the wave curves.

The characteristics for supersonic flow come in three families: streamlines, and two
sonic wave families that cross the streamlines at the Mach angle $A$, $\sin(A) = c/q$, where $c$ is the
local sound speed and $q$ is the flow speed. The characteristic family associated with streamlines
is linearly degenerate and the associated waves are contact discontinuities or slip lines
across which the pressure and flow angle are continuous. The sonic characteristic families
are genuinely nonlinear provided the fundamental derivative of gas dynamics

$$\mathcal{G} = -\frac{1}{2}\,\tau\,\frac{\partial^3 e(\tau, S)/\partial \tau^3}{\partial^2 e(\tau, S)/\partial \tau^2} \ne 0. \qquad (3.4)$$

For most materials $\mathcal{G} > 0$, although $\mathcal{G}$ may be negative near phase transitions. For a
polytropic gas, $\mathcal{G} = \frac{\gamma + 1}{2} > 1$. If it is assumed that $\mathcal{G} > 0$, then the sonic characteristic wave
families support waves that are either shocks or Prandtl-Meyer rarefaction waves. Since the
streamline characteristic field is linearly degenerate, it follows that the solution to the
Riemann problem for steady planar flow can be found by calculating the intersection of the
wave curves for the two sonic families through the data points in the pressure-flow angle
phase space, see [16]. The shock portions of these wave curves correspond to the well
known shock polars as described in [21].
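For a polytropic gas the shock polar has a closed form, which makes the pressure-flow angle picture concrete. The sketch below is a textbook formula rather than the equation of state independent formulation of [16]; the function name `polar_deflection` and its argument names are assumptions made for the example.

```python
import math


def polar_deflection(xi: float, mach: float, gamma: float) -> float:
    """Flow deflection angle (radians) across an oblique shock in a polytropic gas.

    xi    : pressure ratio across the shock (behind / ahead), 1 <= xi <= xi_max
    mach  : Mach number of the supersonic state ahead of the shock
    gamma : ratio of specific heats
    """
    # Largest admissible pressure ratio: the normal shock.
    xi_max = (2.0 * gamma * mach ** 2 - (gamma - 1.0)) / (gamma + 1.0)
    if not 1.0 <= xi <= xi_max:
        raise ValueError("pressure ratio outside the shock polar")
    num = 2.0 * gamma * mach ** 2 - (gamma - 1.0) - (gamma + 1.0) * xi
    den = (gamma + 1.0) * xi + (gamma - 1.0)
    tan_theta = (xi - 1.0) / (gamma * mach ** 2 - xi + 1.0) * math.sqrt(max(num, 0.0) / den)
    return math.atan(tan_theta)
```

Sweeping `xi` from 1 to the normal-shock value traces one branch of the polar in the pressure-flow angle plane; the other branch is its mirror image with the sign of the angle reversed. A regular diffraction state would then be sought where the polar for the reflected wave crosses the polar for the transmitted shock.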

Figure 3.1 shows a representative shock diffraction pattern along with a pair of generic
streamlines and the corresponding shock polars for the case of a reflected shock.

The application of this analysis to the direct simulation of the shock contact collision
consists of calculating at each time step a new steady diffraction pattern based on the changing
angles between the incident waves. The transformation from the locally steady configuration
to the global reference frame for the entire simulation is found by calculating an
intersection between the two propagated sections of the incident shock wave and contact
discontinuity. This intersection defines the position of the point of shock diffraction at time $t + \Delta t$.
The difference between the positions of this point at the beginning and end of the time step
provides the transformation between the two frames of reference.
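The geometric step in this paragraph can be sketched as follows. This is an illustrative sketch only: it assumes each propagated wave section is represented locally by a single straight segment, and the names `segment_intersection` and `node_velocity` are hypothetical.

```python
from typing import Optional, Tuple

Vec = Tuple[float, float]


def segment_intersection(p1: Vec, p2: Vec, q1: Vec, q2: Vec) -> Optional[Vec]:
    """Intersection of segments p1-p2 and q1-q2, or None if they do not cross."""
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    sx, sy = q2[0] - q1[0], q2[1] - q1[1]
    denom = rx * sy - ry * sx          # cross product of the two directions
    if denom == 0.0:
        return None                    # parallel (or degenerate) segments
    qpx, qpy = q1[0] - p1[0], q1[1] - p1[1]
    t = (qpx * sy - qpy * sx) / denom  # parameter along p1-p2
    s = (qpx * ry - qpy * rx) / denom  # parameter along q1-q2
    if 0.0 <= t <= 1.0 and 0.0 <= s <= 1.0:
        return (p1[0] + t * rx, p1[1] + t * ry)
    return None


def node_velocity(old: Vec, new: Vec, dt: float) -> Vec:
    """Velocity of the diffraction node over one time step.

    Subtracting this velocity from the gas velocities gives the Galilean
    frame in which the local configuration is treated as steady.
    """
    return ((new[0] - old[0]) / dt, (new[1] - old[1]) / dt)
```

Given the propagated shock and contact sections, `segment_intersection` yields the node position at the end of the step, and `node_velocity` turns the displacement of the node into the frame velocity.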

There  are  several  important  issues  connected  with  the  changes  in  topology  for  the 
tracked waves as they collide and interact. These include the numerical detection and identification
of the tracked wave interactions, and the changes to the tracked wave structures
needed  to  simulate  the  underlying  physics  of  the  interactions.  See  [16]  for  a  more  detailed 
discussion  of  these  issues. 

4.  Numerical  Results 

Figure 4.1 shows a series of frames documenting the growth of an unstable finger in an
air to sulphur-hexafluoride (SF6) interface. Both gases are modeled as polytropic gases with
$\gamma = 1.4$ and $\gamma = 1.094$ respectively. The shock wave is incident in the air and the ratio of
the pressure behind the shock to the pressure in front is 10. At room temperature the SF6 is
about 5.03 times as dense as air. A net vertical velocity is given to the initial contact
discontinuity. This is done since the boundaries at the top and bottom of the computational
rectangle are open, and it was found that the contact discontinuity exits the computational rectangle
early in the simulation if a reference frame in which the original gas interface is at rest is
used.

The gas interface is flattened by the incident shock wave as the two waves collide. The
diffraction of the shock wave through the interface causes the reflected and transmitted
shocks to assume the geometry of the original gas interface. However, as the waves continue
to propagate away from each other, the unstable mode in the contact discontinuity interface
begins to grow, while the two shock waves restabilize to planar curves. The two shock waves
eventually exit the open boundaries, leaving the contact discontinuity as the only tracked
wave.


This simulation is interesting since during the shock diffraction portion of the run, the
transmitted shock wave is nearly contiguous with the deflected contact discontinuity behind
the point of diffraction. The angle between the two tracked waves is less than 1°. It is
believed that one strength of the front tracking method is the ability to resolve such closely
proximate waves.

Figure 4.3 shows a similar interaction, except here the shock is incident in the heavier
gas. Both gases are taken as polytropic with $\gamma = 1.4$, while the pressure ratio across the
incident shock is 100 and the heavier gas is ten times as dense as the lighter gas. One notes
that the phase of the contact is reversed by the shock wave collision, and the interaction
produces a reflected rarefaction wave rather than a reflected shock wave. The two tracked
waves on the upper side of the contact are the forward and backward edges of the reflected
rarefaction wave. The long time behavior of the unstable interface is shown in Figure 4.3d.
There is some question about the dimple that is produced in the lower edge of the contact.
This may arise as a result of numerical instability. However, there is some evidence that this
dimple may be physical. Further studies using finer grids and comparison with other
simulations are needed to resolve this question.

In addition to calculations of the growth of a single finger, simulations that involve
several unstable modes have been performed. Figure 4.4 shows a three mode interaction
with the interface separating warm air from cooler air. Figure 4.5 shows the interaction of a
shock wave incident in helium ($\gamma = 1.63$) with a helium to air interface.

5.  Conclusions 

It has been shown that front tracking offers a useful method for the simulation of shock
wave and contact discontinuity interactions. It allows for a sharp resolution of the diffracted
wave patterns produced by the interaction of the two waves, and a clear picture of the growth
of unstable modes in the gas interface.

The framework for the resolution of tracked wave interactions has been shown to be
capable of handling complicated situations. Furthermore, it is possible to include new
bifurcations as they are needed, or to remove tracking when the result of a wave interaction is
either too complicated or unknown.
[Figure 4.1: four panels, (a) time 0, (b) time 0.02, (c) time 0.5, (d) time 3.5; scale bar 10 Δx = 10 Δy.]

Fig. 4.1. A shock hitting a contact discontinuity separating air from the gas SF6. The
contact discontinuity curve is given an initial shape of a sine curve. The shock is
incident from the air and has a pressure ratio of 10. The boxed region in Fig. 4.1b is
blown up in the next figure.
[Figure 4.2: time 0.02; labeled waves: deflected contact, reflected shock, incident shock, transmitted shock, ahead contact; scale Δx = Δy.]

Fig. 4.2. A blowup of a subregion of Fig. 4.1b showing the incident shock colliding with
the ahead contact discontinuity, producing reflected and transmitted shocks.
[Figure 4.3: panels include (a) time 0 and (b) time 0.12.]

Fig. 4.3. A shock-contact interaction that produces a reflected rarefaction wave. The
pressure ratio across the shock is 100 and the density ratio across the contact
discontinuity is 10. Both gases are polytropic with $\gamma = 1.4$. The shock wave is incident in the
heavier gas.
[Figure 4.4: three panels, (a) time 0, (b) time 0.04, (c) time 0.12; scale bar 10 Δx = 10 Δy.]

Fig. 4.4. A series of frames showing a shock contact collision interaction. Both gases
are polytropic with $\gamma = 1.4$. The pressure ratio across the incident shock is 100, and the
density ratio (above to below) across the original contact is 2.86. The grid is 40x80.
Fig. 4.5. A series of frames showing a shock in helium ($\gamma = 1.63$) colliding with an air
($\gamma = 1.4$) - helium interface. The pressure in front of the shock is 1 atm. and the
pressure behind is 1000 atm. The density of dry air at 25°C is 0.00118497 g/cc and the
density of helium at the same temperature is 0.000101325 g/cc. The grid is 120x80.
ABSTRACT. We describe several techniques that are based on Richardson's
extrapolation  for  estimating  discretization  errors  of  finite  difference 
solutions  of  one-  and  two-  dimensional  hyperbolic  systems.  These  a  posteriori 
error  estimates  are  intended  for  use  with  adaptive  mesh  moving  and  local 
refinement  procedures.  Mesh  moving  algorithms  produce  nonuniform  grids  which 
necessitate  special  treatment  of  solution  and  error  estimation  techniques. 

The  required  adjustments  are  discussed  using  a  two  step  MacCormack  method  as  a 
model  finite  difference  scheme.  We  also  discuss  automatic  time  step  selection 
procedures  and  the  effects  of  artificial  viscosity.  Extrapolation  schemes 
that  produce  separate  estimates  of  the  temporal  and  spatial  discretization 
errors  are  presented  and  we  show  how  these  may  be  used  to  control  local  mesh 
refinement.  Several  examples  illustrating  these  techniques  are  presented. 

1.  INTRODUCTION.  With  the  use  of  adaptive  methods  to  solve  time-dependent 
partial  differential  equations  there  exists  a  requirement  to  compute  solutions 
on  moving  nonuniform  grids.  There  is  also  a  requirement  to  estimate  the  local 
discretization  error  as  feedback  to  modify  or  refine  the  mesh.  In  this  paper, 
we  discuss  the  MacCormack  finite  difference  scheme  and  a  Richardson 
extrapolation-based  error  estimation  procedure  that  was  used  in  the  adaptive 
algorithm  of  Arney  [3]  and  Arney  and  Flaherty  [4,5]  to  solve  time-dependent 
hyperbolic  systems  in  one  and  two  space  dimensions.  Examples  of  other 
adaptive  methods  with  these  requirements  are  Rai  and  Anderson  [30],  Adjerid 
and  Flaherty  [2],  Bell  and  Shubin  [10],  and  Davis  and  Flaherty  [14]. 
Finite difference methods use a mapping to transform the time and space
variables  from  a  moving  nonuniform  mesh  to  a  stationary  uniform  mesh.  The 
method  used  to  compute  the  metrics  of  this  transformation  must  be  carefully 
chosen  in  order  to  preserve  the  stability,  conservation,  and  accuracy  of  the 
scheme  (cf.,  Thomas  and  Lombard  [33,34]  and  Hindman  [19,20]). 

The  MacCormack  finite  difference  scheme  has  had  wide  use  in  solving 
Eulerian  conservation  laws  for  fluid  dynamics.  The  recent  use  of  artificial 
viscosity  to  make  this  scheme  total  variation  diminishing  (TVD)  makes  it  more 
attractive  as  a  general  solver  for  problems  with  discontinuities  (cf.  Davis 
[13]  and  Roe  [31]).  The  MacCormack  scheme,  our  implementation  of  the 
differencing  of  the  metric  terms,  adaptive  selection  of  the  time  step,  and  the 
TVD  artificial  viscosity  of  Davis  [13]  are  discussed  in  Section  2.  The 
Richardson's extrapolation-based error estimation method produces a pointwise
approximation  of  the  local  discretization  error  which  can  be  used  to  construct 
several  global  measures  of  the  discretization  error.  Our  error  estimate  and 
its  implementation  on  a  moving  mesh  are  discussed  in  Section  3.  In  Section  4, 
we  present  computational  results  of  solutions  of  hyperbolic  problems. 
Computations  were  performed  in  one  and  two  dimensions  on  stationary  uniform 
and  moving  nonuniform  grids.  In  Section  5  we  discuss  the  utility  of  our 
methods,  the  computational  results,  and  future  work. 

2. SOLUTION SCHEME. Consider the hyperbolic vector system of conservation
laws in two space dimensions

$$u_t + f_x(x,y,u,t) + g_y(x,y,u,t) = 0, \quad (x,y) \in D, \quad t > 0, \qquad (2.1)$$

$$u(x,y,0) = u_0(x,y), \quad (x,y) \in D \cup \partial D, \qquad (2.2)$$

with appropriate well-posed conditions on the boundary $\partial D$ of a rectangular
domain $D$.

We  chose  to  implement  the  MacCormack  finite  difference  scheme  for 
hyperbolic  problems  because  of  its  general  applicability.  The  MacCormack 
scheme,  like  most  higher-order  methods,  will  suffer  a  reduction  in  order  on  a 
moving  nonuniform  grid.  Despite  this  fact,  proper  mesh  moving  and  node 
placement  by  an  effective  adaptive  procedure  provide  enough  efficiency  and 
accuracy  to  compensate  for  this  order  reduction. 

A.  MacCormack  Scheme 


In order to discretize (2.1) we introduce a transformation

$$\xi = \xi(x,y,t), \quad \eta = \eta(x,y,t), \quad \tau = t, \qquad (2.3)$$

from the physical $(x,y,t)$ domain to a computational $(\xi,\eta,\tau)$ domain where a
uniform rectangular grid will be used. Under this transformation (2.1)
becomes

$$u_\tau + u_\xi \xi_t + u_\eta \eta_t + f_\xi \xi_x + f_\eta \eta_x + g_\xi \xi_y + g_\eta \eta_y = 0. \qquad (2.4)$$

The transformation metrics $(\xi_x, \xi_y, \xi_t, \eta_x, \eta_y, \eta_t)$ are related to the metrics
$(x_\xi, x_\eta, x_\tau, y_\xi, y_\eta, y_\tau)$ by the identities

$$\xi_x = \frac{y_\eta}{J}, \quad \xi_y = -\frac{x_\eta}{J}, \quad \xi_t = \frac{y_\tau x_\eta - x_\tau y_\eta}{J},$$

$$\eta_x = -\frac{y_\xi}{J}, \quad \eta_y = \frac{x_\xi}{J}, \quad \eta_t = \frac{y_\xi x_\tau - x_\xi y_\tau}{J}, \quad J = x_\xi y_\eta - x_\eta y_\xi. \qquad (2.5)$$
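The metric identities can be checked numerically on a simple mapping. The sketch below assumes the standard inverse-Jacobian relations $\xi_x = y_\eta/J$, $\xi_y = -x_\eta/J$, $\eta_x = -y_\xi/J$, $\eta_y = x_\xi/J$ with $J = x_\xi y_\eta - x_\eta y_\xi$, together with the time metrics that follow from the chain rule; the function name `inverse_metrics` and the dictionary layout are hypothetical.

```python
def inverse_metrics(x_xi, x_eta, x_tau, y_xi, y_eta, y_tau):
    """Metrics of (xi, eta) with respect to (x, y, t) from those of (x, y) with respect to (xi, eta, tau)."""
    jac = x_xi * y_eta - x_eta * y_xi
    if jac == 0.0:
        raise ValueError("singular transformation")
    return {
        "J": jac,
        "xi_x": y_eta / jac,
        "xi_y": -x_eta / jac,
        # xi_t = -(xi_x * x_tau + xi_y * y_tau): a fixed physical point drifts in xi
        "xi_t": (y_tau * x_eta - x_tau * y_eta) / jac,
        "eta_x": -y_xi / jac,
        "eta_y": x_xi / jac,
        "eta_t": (y_xi * x_tau - x_xi * y_tau) / jac,
    }
```

For the affine map $x = 2\xi + \eta + 3\tau$, $y = \xi + 3\eta - \tau$ the function returns $J = 5$, $\xi_x = 0.6$, $\xi_y = -0.2$, $\xi_t = -2$, $\eta_x = -0.2$, $\eta_y = 0.4$ and $\eta_t = 1$, which agrees with inverting the map by hand.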


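Using (2.5) in (2.4) gives

$$u_\tau + u_\xi \frac{y_\tau x_\eta - x_\tau y_\eta}{J} + u_\eta \frac{y_\xi x_\tau - x_\xi y_\tau}{J} + f_\xi \frac{y_\eta}{J} - f_\eta \frac{y_\xi}{J} - g_\xi \frac{x_\eta}{J} + g_\eta \frac{x_\xi}{J} = 0. \qquad (2.6)$$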


This  equation  can  be  rewritten  in  another  form  in  the  original  transformation 
metrics  by  further  substitutions  of  (2.5)  into  (2.2)  as 


$$u_\tau - u_\xi (x_\tau \xi_x + y_\tau \xi_y) - u_\eta (x_\tau \eta_x + y_\tau \eta_y) + f_\xi \xi_x + f_\eta \eta_x + g_\xi \xi_y + g_\eta \eta_y = 0. \qquad (2.7)$$


Some  authors  (cf.,  Hyman  [22]  and  Thompson  [35])  prefer  to  write  this  equation 
in  still  another  form  as 


u  +  u_C  +  u  n  +  f.S  +  g_C  +  g  n  =  0. 
x  E  t  n  t  K  x  C  y  n  y 


(2.8) 


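A uniform space-time grid having mesh spacing $\Delta\xi \times \Delta\eta \times \Delta\tau$ is introduced
onto the computational domain. The finite difference solution at
$(\ell\Delta\xi, m\Delta\eta, n\Delta\tau)$ is referred to as $U^n_{\ell,m}$. A similar notation is used for the
fluxes $f$ and $g$ and the metrics (cf. Eq. (2.5)). The two-step MacCormack scheme
[24] uses first-order forward temporal and spatial difference approximations
in the predictor step, and first-order backward differences in the corrector
step. The predicted solution $\bar U^{n+1}_{\ell,m}$ satisfies

$$\bar U^{n+1}_{\ell,m} = U^n_{\ell,m} - \frac{\Delta\tau}{\Delta\xi}\Big[(U^n_{\ell+1,m} - U^n_{\ell,m})(\xi_t)^n_{\ell,m} + (f^n_{\ell+1,m} - f^n_{\ell,m})(\xi_x)^n_{\ell,m} + (g^n_{\ell+1,m} - g^n_{\ell,m})(\xi_y)^n_{\ell,m}\Big]$$

$$\qquad - \frac{\Delta\tau}{\Delta\eta}\Big[(U^n_{\ell,m+1} - U^n_{\ell,m})(\eta_t)^n_{\ell,m} + (f^n_{\ell,m+1} - f^n_{\ell,m})(\eta_x)^n_{\ell,m} + (g^n_{\ell,m+1} - g^n_{\ell,m})(\eta_y)^n_{\ell,m}\Big]. \qquad (2.9)$$

The metrics $(\xi_t)^n_{\ell,m}$, etc. are computed by forward differences. The corrected
solution $U^{n+1}_{\ell,m}$ satisfies

$$U^{n+1}_{\ell,m} = \frac{1}{2}\Big\{U^n_{\ell,m} + \bar U^{n+1}_{\ell,m} - \frac{\Delta\tau}{\Delta\xi}\Big[(\bar U^{n+1}_{\ell,m} - \bar U^{n+1}_{\ell-1,m})(\xi_t)^{n+1}_{\ell,m} + (\bar f^{n+1}_{\ell,m} - \bar f^{n+1}_{\ell-1,m})(\xi_x)^{n+1}_{\ell,m} + (\bar g^{n+1}_{\ell,m} - \bar g^{n+1}_{\ell-1,m})(\xi_y)^{n+1}_{\ell,m}\Big]$$

$$\qquad - \frac{\Delta\tau}{\Delta\eta}\Big[(\bar U^{n+1}_{\ell,m} - \bar U^{n+1}_{\ell,m-1})(\eta_t)^{n+1}_{\ell,m} + (\bar f^{n+1}_{\ell,m} - \bar f^{n+1}_{\ell,m-1})(\eta_x)^{n+1}_{\ell,m} + (\bar g^{n+1}_{\ell,m} - \bar g^{n+1}_{\ell,m-1})(\eta_y)^{n+1}_{\ell,m}\Big]\Big\}, \qquad (2.10)$$

with metrics computed by backward differences. The notation $\bar f^{n+1}_{\ell,m}$ denotes
$f(\bar U^{n+1}_{\ell,m})$, etc. The use of first forward and backward difference approximations
for the metrics implies that the transformation from the computational to the
physical domain is piecewise trilinear in space and time for the predictor and
corrector steps. Such low order difference approximations are responsible for
reducing the order of the MacCormack scheme. A smoother transformation and
the use of higher-order difference approximations of the metrics could be used
to maintain second-order accuracy.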
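The predictor-corrector structure is easiest to see in one space dimension on a stationary uniform grid, where all metric terms drop out. The following is a minimal sketch under those simplifying assumptions (periodic boundaries, scalar conservation law $u_t + f(u)_x = 0$), not the moving-mesh implementation discussed in this paper; the function name `maccormack_step` is hypothetical.

```python
import math


def maccormack_step(u, flux, dt, dx):
    """One MacCormack step for u_t + f(u)_x = 0 on a periodic uniform grid."""
    n = len(u)
    nu = dt / dx
    f = [flux(v) for v in u]
    # Predictor: forward difference of the flux.
    pred = [u[j] - nu * (f[(j + 1) % n] - f[j]) for j in range(n)]
    fp = [flux(v) for v in pred]
    # Corrector: backward difference of the predicted flux, averaged with u.
    # Index j - 1 wraps to the last cell when j == 0 (periodic boundary).
    return [0.5 * (u[j] + pred[j] - nu * (fp[j] - fp[j - 1])) for j in range(n)]


if __name__ == "__main__":
    cells = 50
    dx = 1.0 / cells
    u0 = [math.sin(2.0 * math.pi * j * dx) for j in range(cells)]
    # Linear advection with unit speed and Courant number 0.5.
    u1 = maccormack_step(u0, lambda v: v, 0.5 * dx, dx)
    print(max(abs(a - b) for a, b in zip(u0, u1)))
```

For linear advection at Courant number one the step reduces to an exact shift of the data by one cell, which gives a convenient sanity check of an implementation.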


It  was  shown  by  Hindman  [19,20]  that  this  differencing  of  Equation  (2.6) 
produces  consistent  approximations.  Therefore,  a  uniform  flow  solution  is 
maintained.  Other  conservative  forms  for  the  transformed  equations  were 
investigated  by  Hindman  [19]  and  found  to  be  less  efficient  or  needing  special 
differencing  of  the  metrics  for  computing  consistent  approximations. 


Equation (2.4) is conservative on a moving mesh. We show this for a
one-dimensional scalar conservation law by investigating the Rankine-Hugoniot
jump conditions across a shock discontinuity. Consider a conservation law in
the form

$$\left(\int_{-\infty}^{\infty} u\,dx\right)_t + f(u)\Big|_{-\infty}^{\infty} = 0. \qquad (2.11)$$

The jump conditions for a discontinuity at $x = s(t)$ satisfy

$$\dot s = \frac{[f]}{[u]}, \qquad (2.12)$$

where $[q]$ indicates the jump in $q$ and $\dot s = ds/dt$ denotes the shock velocity [37].

A conservation law on a moving mesh produced by a transformation of
variables to a uniform stationary mesh satisfies

$$\left(\frac{\partial}{\partial\tau} + \xi_t \frac{\partial}{\partial\xi}\right)\int_{-\infty}^{\infty} u\,x_\xi\,d\xi + f(u)\Big|_{-\infty}^{\infty} = 0. \qquad (2.13)$$

Assuming the existence of a shock discontinuity $\xi = r(\tau)$ gives

$$\dot r\,x_\xi [u] + \int_{-\infty}^{r} (u\,x_\xi)_\tau\,d\xi + \int_{r}^{\infty} (u\,x_\xi)_\tau\,d\xi + f(u)\Big|_{-\infty}^{\infty} = 0. \qquad (2.14)$$

Using the chain rule provides an integrable form

$$\dot r\,x_\xi [u] + \int_{-\infty}^{r} (u\,x_\tau - f)_\xi\,d\xi + \int_{r}^{\infty} (u\,x_\tau - f)_\xi\,d\xi + f(u)\Big|_{-\infty}^{\infty} = 0. \qquad (2.15)$$

Integration of this equation gives jump conditions in the computational domain
as

$$\dot r\,x_\xi [u] - [f] + x_\tau [u] = 0. \qquad (2.16)$$

Since $s(t)$ and $r(\tau)$ are related by

$$\dot s = \dot r\,x_\xi + x_\tau, \qquad (2.17)$$

the appropriate jump condition (2.12) is recovered.

B.  Variable  Time  step. 

The  explicit  MacCormack  scheme  has  a  stability  restriction  that  limits 
the  time  step  allowed  for  a  given  spatial  mesh.  For  efficient  computation, 
the  time  step  should  be  adaptively  set  close  to  the  maximum  allowed  by  the 
Courant,  Friedrichs,  Lewy  theorem  [27].  Thus,  we  choose 

At  -  - — -  .  (2.18) 

2  !1  max(ip  ,id) 

The computational mesh has been selected to have spacing $\Delta\xi = \Delta\eta = 1$ and the
constant 0.8 provides a twenty-percent margin of safety. The quantities $\varphi$
and $\omega$ are the spectral radii of one-dimensional conservation laws on moving
meshes, i.e.,

$$\varphi = \max_i \left|(\lambda_i - x_\tau)\xi_x + (\mu_i - y_\tau)\xi_y\right|, \qquad (2.19a)$$

$$\omega = \max_i \left|(\lambda_i - x_\tau)\eta_x + (\mu_i - y_\tau)\eta_y\right|, \qquad (2.19b)$$

where $\lambda_i$ and $\mu_i$ are eigenvalues of $f_u(u)$ and $g_u(u)$. These eigenvalues and the
metrics in (2.19) are evaluated at the beginning of each time step.
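The idea of adaptive time step selection can be sketched in one space dimension. The code below is a simplified one-dimensional analogue under stated assumptions, not a transcription of (2.18): it takes the characteristic speeds relative to the moving mesh, scales them by the metric $\xi_x$, and applies the 0.8 safety factor mentioned in the text. The function name `cfl_time_step` and its arguments are hypothetical.

```python
def cfl_time_step(eigenvalues, mesh_speeds, xi_x, safety=0.8, d_xi=1.0):
    """One-dimensional moving-mesh time step estimate.

    eigenvalues : per node, a list of characteristic speeds lambda_i
    mesh_speeds : per node, the mesh velocity x_tau
    xi_x        : per node, the metric d(xi)/dx
    """
    phi = 0.0
    for lams, x_tau, metric in zip(eigenvalues, mesh_speeds, xi_x):
        for lam in lams:
            # Speed seen in the computational coordinate at this node.
            phi = max(phi, abs((lam - x_tau) * metric))
    if phi == 0.0:
        raise ValueError("no wave motion relative to the mesh")
    return safety * d_xi / phi
```

A mesh that moves with the fastest wave reduces the relative speed and therefore allows a larger time step, which is one of the benefits of mesh moving.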


C.  Artificial  Viscosity 


The  MacCormack  scheme,  being  a  second-order  accurate  centered  scheme, 
produces  spurious  oscillations  near  discontinuities.  In  order  to  eliminate  or 
reduce  these  oscillations,  artificial  viscosity  or  dissipation  is  added  to  the 
solution  to  diffuse  the  discontinuity.  The  viscosity  is  often  problem 
dependent,  and  considerable  "fine  tuning"  is  usually  needed  to  balance  the 
effects  of  the  spurious  oscillations  and  diffusion  [23] . 

We use an artificial viscosity model due to Davis [13] which is not
problem dependent and only requires knowledge of $\varphi$ and $\omega$. This artificial
viscosity model is designed to convert the MacCormack scheme into a total
variation diminishing (TVD) scheme in one dimension. A scheme is TVD if the
total  variation  of  the  solution  to  an  initial  value  problem  is  non-increasing 
in  time.  Recent  research  efforts  have  resulted  in  the  development  of  other 
second-order  accurate  TVD  schemes  (cf.,  Osher  and  Chakravarthy  [28]  and 
Warming  and  Beam  [36]). 
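The TVD property can be checked directly on a discrete solution. The sketch below is only an illustration of the definition, not the Davis viscosity itself: it measures the total variation of a periodic grid function and compares a first-order upwind step, which is TVD, with an unlimited Lax-Wendroff step, to which the MacCormack scheme reduces for linear advection. All function names are hypothetical.

```python
def total_variation(u):
    """Total variation of a periodic grid function: sum of |u[j+1] - u[j]|."""
    n = len(u)
    return sum(abs(u[(j + 1) % n] - u[j]) for j in range(n))


def upwind_step(u, c):
    """First-order upwind step for u_t + a u_x = 0, a > 0, Courant number c."""
    return [u[j] - c * (u[j] - u[j - 1]) for j in range(len(u))]


def lax_wendroff_step(u, c):
    """Second-order Lax-Wendroff step without any limiter."""
    n = len(u)
    return [
        u[j]
        - 0.5 * c * (u[(j + 1) % n] - u[j - 1])
        + 0.5 * c * c * (u[(j + 1) % n] - 2.0 * u[j] + u[j - 1])
        for j in range(n)
    ]


if __name__ == "__main__":
    square = [1.0] * 10 + [0.0] * 10
    print(total_variation(square))
    print(total_variation(upwind_step(square, 0.5)))
    print(total_variation(lax_wendroff_step(square, 0.5)))
```

On a square wave the upwind step leaves the total variation unchanged while the unlimited second-order step increases it through overshoots at the jumps, which is the behavior the artificial viscosity is meant to suppress.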

The  artificial  viscosity  of  Davis  [13]  is  based  on  a  flux  limiter  that 
does  not  depend  on  explicitly  determining  the  upwind  direction  and,  with  a 
recent  modification  by  Roe  [31],  does  not  affect  the  region  of  stability  of 
the MacCormack scheme. Because the MacCormack scheme also does not determine
the upwind direction, the combined use of the MacCormack scheme and Davis's
artificial viscosity is computationally simpler to perform than many other TVD
schemes. The artificial viscosity terms are calculated from the solution data
at the beginning of the time step. For two-dimensional problems separate
dissipative terms are calculated in the $\xi$ and $\eta$ directions respectively.

3.  ERROR  ESTIMATION.  Accurate  a  posteriori  error  estimation  is  an  integral 
part  of  an  adaptive  software  system.  Error  estimation  can  be  the  most 
expensive  part  of  an  adaptive  procedure  and  an  important  goal  is  to  find 
accurate  and  inexpensive  ways  of  estimating  the  discretization  error  (cf., 
Babuska,  et  al.  [8,9]).  The  error  estimation  technique  is  dependent  on  many 
factors,  including  the  type  of  solver  used  in  the  algorithm,  the  type  of  error 
to  be  determined,  and  the  norm  in  which  the  error  estimate  is  to  be  measured. 
It  is  most  desirable  to  have  a  procedure  that  provides  pointwise  estimates  of 
the error which can then be used to find estimates in several local and global
norms.

Mesh  nonuniformity  affects  the  accuracy  and  convergence  of  numerical 
schemes  and  error  estimation.  The  effects  of  the  mesh  on  the  solution  scheme 
have  been  studied  by  Ciment  [12],  Fritts  [17],  Hoffman  [21],  Osher  and 
Sanders  [29],  Sanders  [32],  and  Mastin  [25].  Error  analysis  seems  to  be  more 
natural  and  further  developed  for  finite  element  schemes,  especially  for 
elliptic and parabolic problems (cf. Adjerid and Flaherty [1,2], Zienkiewicz
et al. [38,39], and Babuska and Rheinboldt [6,7]), where relatively
inexpensive local calculations are used to provide accurate global spatial
error estimates. More study needs to be done to find less expensive and more
accurate error estimates for finite difference schemes for hyperbolic
problems.

We calculate the local temporal and spatial portions of the discretization
error, using an algorithm based on Richardson extrapolation. Flaherty and
Moore [15,16] and Berger and Oliger [11] also use Richardson extrapolation to
estimate error on uniform meshes for their local mesh refinement algorithms.


A.  Richardson  Extrapolation  Error  Estimation 


We develop the error estimate for the second-order MacCormack scheme for
a linear scalar problem in two dimensions. Separate pointwise estimates at a
general spatial node i, at time t, for the local temporal error $E_i^T(t)$ and
local spatial error $E_i^S(t)$ are obtained with two different extrapolation
procedures.


Consider a uniform mesh with spacing $\Delta x \times \Delta y$ and time step
$\Delta t$. Let the exact solution at node i and time t be denoted as $u_i(t)$,
the numerical solution by the MacCormack scheme at the same point and time as
$U_i(t;\Delta x,\Delta y,\Delta t)$, and the MacCormack finite difference
operator as $L(\Delta x,\Delta y,\Delta t)$, i.e.,

$$U_i(t+\Delta t;\Delta x,\Delta y,\Delta t) = L(\Delta x,\Delta y,\Delta t)\,U_i(t;\Delta x,\Delta y,\Delta t). \tag{3.1}$$

Assume that the local error has a Taylor's series expansion of the form

$$u_i(t) - U_i(t;\Delta x,\Delta y,\Delta t) = \Delta t\,[c_1\Delta t^2 + c_2\Delta x^2 + c_3\Delta y^2 + \cdots], \tag{3.2}$$

where the constants $c_1$, $c_2$, $c_3$, ... are independent of the mesh spacing.


To estimate the spatial component of the error, we calculate a solution
on a mesh of double spatial size ($2\Delta x \times 2\Delta y$) with the same
time step ($\Delta t$). The local error on this mesh satisfies

$$u_i(t+\Delta t) - U_i(t+\Delta t;2\Delta x,2\Delta y,\Delta t) = \Delta t\,[c_1\Delta t^2 + 4c_2\Delta x^2 + 4c_3\Delta y^2 + \cdots]. \tag{3.3}$$

Subtracting (3.3) from (3.2), and neglecting higher-order terms, we obtain
an expression for the leading term in the spatial portion of the local error
for the MacCormack scheme on the $\Delta x \times \Delta y \times \Delta t$ mesh as

$$E_i^S(t+\Delta t) := \Delta t\,[c_2\Delta x^2 + c_3\Delta y^2] = \frac{1}{3}\,[U_i(t+\Delta t;\Delta x,\Delta y,\Delta t) - U_i(t+\Delta t;2\Delta x,2\Delta y,\Delta t)]. \tag{3.4}$$


Similarly, an estimate of the temporal portion of the local error,
$E_i^T(t+\Delta t)$, can be calculated by computing another solution on the
$\Delta x \times \Delta y$ spatial mesh using two time steps of $\Delta t/2$,
subtracting this result from (3.2), and retaining leading-order terms as

$$E_i^T(t+\Delta t) := \Delta t\,[c_1\Delta t^2] = \frac{4}{3}\,[U_i(t+\Delta t;\Delta x,\Delta y,2(\Delta t/2)) - U_i(t+\Delta t;\Delta x,\Delta y,\Delta t)]. \tag{3.5}$$

The leading term of the local error at node i and time $t+\Delta t$ is

$$E_i(t+\Delta t) = E_i^T(t+\Delta t) + E_i^S(t+\Delta t). \tag{3.6}$$


There are several disadvantages to this technique that should be noted:

(i) the error cannot be calculated for nodes on or adjacent to the boundary;

(ii) the solution must be smooth enough for $c_1$, $c_2$, and $c_3$ to exist;

(iii) the error estimation costs approximately three times more to compute
than the solution; and

(iv) the mesh must be uniform.

Equation (3.6) may still be useful as a mesh refinement or motion indicator
even in situations where jumps in the solution render it invalid as an
estimate of the error.
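
The two extrapolations (3.4) and (3.5) can be illustrated with a short Python sketch for the one-dimensional equation u_t + b u_x = 0. The periodic mesh, the sine initial data, and the names `maccormack_step` and `richardson_estimates` are assumptions made for this example only; the paper itself works in two dimensions and excludes nodes near the boundary. The error is taken as u - U, the sign convention of (3.2).

```python
import math


def maccormack_step(u, c):
    """One MacCormack step for u_t + b u_x = 0 on a periodic mesh (c = b*dt/dx)."""
    n = len(u)
    # Predictor: forward difference.
    us = [u[i] - c * (u[(i + 1) % n] - u[i]) for i in range(n)]
    # Corrector: backward difference on the predicted values (us[-1] wraps around).
    return [0.5 * (u[i] + us[i] - c * (us[i] - us[i - 1])) for i in range(n)]


def richardson_estimates(u, b, dx, dt):
    """Return the fine solution and the spatial/temporal estimates at even nodes."""
    c = b * dt / dx
    fine = maccormack_step(u, c)                           # mesh dx, step dt
    coarse = maccormack_step(u[::2], b * dt / (2 * dx))    # mesh 2*dx, step dt
    half = maccormack_step(maccormack_step(u, c / 2), c / 2)  # two steps of dt/2
    m = len(coarse)
    e_s = [(fine[2 * j] - coarse[j]) / 3.0 for j in range(m)]        # cf. (3.4)
    e_t = [4.0 * (half[2 * j] - fine[2 * j]) / 3.0 for j in range(m)]  # cf. (3.5)
    return fine, e_s, e_t


n, b = 64, 1.0
dx = 1.0 / n
dt = 0.5 * dx
x = [i * dx for i in range(n)]
u0 = [math.sin(2 * math.pi * xi) for xi in x]

fine, e_s, e_t = richardson_estimates(u0, b, dx, dt)
exact = [math.sin(2 * math.pi * (xi - b * dt)) for xi in x]

estimated = [s + t for s, t in zip(e_s, e_t)]               # cf. (3.6)
true_err = [exact[2 * j] - fine[2 * j] for j in range(n // 2)]
theta = sum(abs(v) for v in estimated) / sum(abs(v) for v in true_err)
```

Under these assumptions the ratio `theta` of estimated to exact error, summed over the nodes common to both meshes, should be close to one for this smooth solution.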

Richardson's extrapolation can be done in a more classic manner provided
that we are willing to forego separate spatial and temporal error estimates.
We illustrate the method for a one-dimensional problem. In this case, the
error at node i in a solution on a mesh having spacing $\Delta x \times \Delta t$
is estimated by calculating a second solution on a mesh with spacing
$\Delta x/2$ using two time steps of $\Delta t/2$. According to (3.3) restricted
to one dimension, the local error on this mesh satisfies

$$u_i(t+\Delta t) - U_i(t+\Delta t;\Delta x/2,2(\Delta t/2)) = \Delta t\,[c_1\Delta t^2/4 + c_2\Delta x^2/4 + \cdots]. \tag{3.7}$$


Subtracting (3.7) from (3.3) and neglecting higher-order terms, we can
obtain error estimates for either $U_i(t+\Delta t;\Delta x,\Delta t)$ or
$U_i(t+\Delta t;\Delta x/2,2(\Delta t/2))$, provided that node i is common to
both meshes. Our adaptive method carries the fine grid solution forward in
time; thus, we estimate its error as

$$E_i(t+\Delta t) = \frac{\Delta t}{4}\,(c_1\Delta t^2 + c_2\Delta x^2) = \frac{1}{3}\,[U_i(t+\Delta t;\Delta x/2,2(\Delta t/2)) - U_i(t+\Delta t;\Delta x,\Delta t)]. \tag{3.8}$$


Using  this  procedure  the  error  can  now  be  calculated  at  nodes  adjacent 
to boundaries. Even though this error estimate costs four times more to
compute than the solution, we only incur this overhead in the first level of
refinement.  No  additional  cost  is  incurred  if  portions  of  the  mesh  have  to  be 
refined,  because  the  solution  on  the  refined  mesh  has  already  been  computed 
and  stored  while  estimating  the  error  for  the  coarser  parent  mesh. 

B.  Error  Estimation  for  a  Moving  Nonuniform  Mesh 

Nonuniformity of the mesh changes the discretization error of the
MacCormack scheme. For simplicity, we will determine this error and analyze
its effects on the Richardson extrapolation error estimation using a linear
scalar problem in one space dimension,

$$u_t + b\,u_x = 0. \tag{3.9}$$

The local error for the MacCormack method on a one-dimensional moving
nonuniform  mesh  is 

u.  (t  +  At)  -  (J.(t  +  At ; Ax,  At)  -  At[  -  ~  (Ax"+i  -  Ax?)u 

(3ao) 

Ax 

-  At  b?(l  -  ( — -)u  )  +  c  At2  +  C  Ax2]  , 

Ax?  xx 
l 

where $\Delta x_i^n$ and $\Delta x_{i+1}^n$ are the mesh sizes on the left and
right of node i at time step n, respectively, and
$\Delta x = \max(\Delta x_i^n, \Delta x_{i+1}^n)$. On the moving nonuniform mesh,
both the temporal and spatial error components contain second-order terms,
whereas the error on a uniform mesh is third order. The previous analysis can
be used to show that the leading component of the temporal error is


Ej(t  +  At):-  A t [  -  At  b?(l  -  ^4-)uxx] 

Ax. 


(3.11) 


$= 2\,[U_i(t+\Delta t;\Delta x,2(\Delta t/2)) - U_i(t+\Delta t;\Delta x,\Delta t)].$


Calculation  of  the  spatial  portion  of  the  error  is  more  difficult  since 
the  temporal  portion  of  the  error  does  not  cancel  upon  subtraction  of 
solutions  calculated  on  two  spatially  different  meshes.  We  overcome  this 
difficulty  and  also  greatly  simplify  the  procedure  in  two  dimensions  by 
constraining  the  mesh  to  maintain  double  size  increments  for  special  nodes  of 
the  moving  coarse  mesh.  This  constrained  grid  structure  consists  of  a  coarse 
mesh,  shown  with  darker  lines  in  Figure  1,  containing  properly  nested  fine 
cells  created  by  binary  division  of  the  sides  of  the  coarse  cells,  shown  by 
lighter  lines  in  Figure  1.  The  vertices  of  the  coarse  cells  are  denoted  as 
"independent  moving  nodes".  Error  estimates  are  calculated  for  these  nodes. 
The  remaining  nodes  in  the  mesh  of  Figure  1  are  "dependent  moving  nodes"  which 
must  be  moved  to  maintain  the  constrained  grid  structure.  A  solution  is 
computed for these "dependent moving nodes," but no error estimate is
obtained. 

For  the  "independent  moving  nodes",  the  spatial  error  calculation  can 
proceed  as  for  a  uniform  mesh;  therefore,  the  local  spatial  error  estimate  is 


E*(t  +  At)  -  At[  -  ^(Axf1  -  Ax£)uxx] 

$= U_i(t+\Delta t;\Delta x,\Delta t) - U_i(t+\Delta t;2\Delta x,\Delta t).$


(3.12) 


The above analysis extends directly to two dimensions; hence, we have a
Richardson extrapolation-based procedure for estimating error on a moving
nonuniform grid. In practice, we test for local uniformity of the mesh and,
if it is found, use formulas (3.4)-(3.6) to compute error estimates.


Figure 1. Spatial structure of the moving coarse mesh (bold lines) with
embedded fine mesh (fine lines) used for the error estimation.

Error estimation for systems of equations involves the use of a vector
norm at node i and time t. The examples of Section 4 use the maximum norm,
i.e.,

$$E_i(t) := \max_{1 \le j \le N} |E_{i,j}(t)|, \tag{3.13}$$

where N is the number of equations in the system and $E_{i,j}(t)$ is the local
error estimate for the jth component of the solution vector at node i.
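
As a minimal sketch of (3.13), the nodal error for a system can be formed from the per-component estimates as follows. The function name `nodal_error_norm` and the sample values are hypothetical and serve only to show the reduction over components.

```python
def nodal_error_norm(component_errors):
    """Maximum norm over the N solution components at one node, cf. (3.13).

    component_errors[j] holds the local estimate for component j at that node.
    """
    return max(abs(e) for e in component_errors)


# Hypothetical estimates for a three-equation system at a single node.
e_node = nodal_error_norm([0.1, -0.3, 0.2])
```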

4. COMPUTATIONAL EXAMPLES. The solution and local error estimation procedures
are applied to four examples. In Example 4.1, we demonstrate the capability
of the MacCormack scheme with Davis' (TVD) artificial viscosity on a moving
nonuniform mesh. In Example 4.2, we investigate a one-dimensional problem
using a modified form of the error estimate (3.8,9). Examples 4.3 and 4.4
illustrate the performance of the error estimation procedure on a problem
having a smooth solution and one with a jump in the first derivative,
respectively. We investigate the accuracy and convergence of the local error
estimator by determining an effectivity index

$$\theta = \frac{\|E\|_1}{\|e\|_1} \tag{4.1a}$$

at a fixed time t for several different meshes and different adaptive
strategies. Here e and E are the exact and estimated errors, respectively.

(4.1b)

is obtained by assuming E to be a piecewise constant function.

Example 4.1. Consider the initial-boundary value problem

$$u_t - y\,u_x + x\,u_y = 0, \quad t > 0,\ -1.2 < x < 1.2,\ -1.2 < y < 1.2, \tag{4.2}$$

$$u(x,y,0) = \begin{cases} 0, & \text{if } (x - \tfrac{1}{2})^2 + 1.5y^2 \ge \tfrac{1}{16}, \\ 1 - 16\,((x - \tfrac{1}{2})^2 + 1.5y^2), & \text{otherwise,} \end{cases} \tag{4.3}$$

and

$$u(1.2,y,t) = u(-1.2,y,t) = u(x,-1.2,t) = u(x,1.2,t) = 0. \tag{4.4}$$

The exact solution is

$$u(x,y,t) = \begin{cases} 0, & \text{if } C < 0, \\ C, & \text{if } C \ge 0, \end{cases} \tag{4.5a}$$

where

$$C = 1 - 16\,((x\cos t + y\sin t - \tfrac{1}{2})^2 + 1.5\,(y\cos t - x\sin t)^2). \tag{4.5b}$$

Equations (4.5) represent a moving elliptical cone rotating counterclockwise
around the origin with period $2\pi$. This problem was proposed as a
test problem by Gottlieb and Orszag [18] and was used as a test problem in a
survey by McRae et al. [26].
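
The rotating-cone solution (4.5) is easy to evaluate directly, which is convenient when computing exact errors. The sketch below is only an illustration; the function names `cone` and `exact_solution` are not from the paper.

```python
import math


def cone(x, y):
    """Initial elliptical cone of (4.3), centred at (1/2, 0) with unit peak."""
    c = 1.0 - 16.0 * ((x - 0.5) ** 2 + 1.5 * y ** 2)
    return max(c, 0.0)


def exact_solution(x, y, t):
    """Exact solution (4.5): the initial cone rotated counterclockwise by angle t."""
    xr = x * math.cos(t) + y * math.sin(t)
    yr = y * math.cos(t) - x * math.sin(t)
    return cone(xr, yr)


peak_initial = exact_solution(0.5, 0.0, 0.0)            # cone centre at t = 0
peak_quarter = exact_solution(0.0, 0.5, math.pi / 2)    # centre after a quarter turn
after_period = exact_solution(0.3, 0.1, 2 * math.pi)    # one full revolution
```

After one full period the solution returns to the initial data, and after a quarter period the peak has moved from (1/2, 0) to (0, 1/2), consistent with counterclockwise rotation.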

We show the sequence of meshes that were generated at t = 0, 1.6, and 3.2
using  the  adaptive  mesh  moving  method  of  Arney  and  Flaherty  [4]  in  Figures  2, 
3,  and  4,  respectively.  Arney  and  Flaherty's  [4]  mesh  moving  method  utilizes 
the  error  estimates  of  Section  3  to  concentrate  the  mesh  in  the  high-error 
region  beneath  the  cone  and  to  follow  it  as  it  rotates.  It  also  increases  the 
accuracy  of  the  solution  and  reduces  oscillations  in  the  wake  following  the 
cone.  However,  small  oscillations  are  still  present.  Next  we  solve  this 
problem  with  the  same  moving  mesh  technique,  by  using  Davis'  [13]  artificial 
viscosity  with  the  MacCormack  scheme.  Surface  and  contour  plots  of  solutions 
with  and  without  artificial  viscosity  are  shown  in  Figures  5  and  6.  There  is 
no  artificial  wake  behind  the  cone  when  artificial  viscosity  is  used. 

However,  the  artificial  viscosity  slightly  diffuses  the  cone,  widening  its 
base  and  reducing  its  peak  from  1.0  to  0.88. 


Figure 6. Surface plots of the solutions of Example 4.1 on a moving
mesh without artificial viscosity (top) and with artificial
viscosity (bottom) at t = 3.2.

Example 4.2. We consider an application of the direct Richardson's
extrapolation error estimation procedure (3.9) to the one-dimensional linear
scalar equation

$$u_t + u_x = 0, \quad t > 0,\ 0 < x < 0.8, \tag{4.6}$$

with initial and Dirichlet boundary conditions specified so that the exact
solution is

$$u(x,t) = 1 - \tanh[100\,(x - t - 0.2)]. \tag{4.7}$$

This  solution  is  a  relatively  steep  wave  that  moves  at  unit  speed  across  the 
domain. 

We  solved  this  problem  for  one  time  step  on  seven  different  uniform 
meshes  having  N  computational  cells  per  time  step  in  order  to  investigate 
accuracy  and  convergence  of  the  error  estimate.  Table  1  shows  the  results 
obtained from these calculations. The effectivity ratio appears to be
converging to unity.

We  also  solved  this  problem  using  Arney  and  Flaherty's  [5]  adaptive  local 
refinement procedure on a base mesh having $\Delta x = \Delta t = 0.1$ with a local error
tolerance  of  1/128.  The  mesh  created  by  the  local  refinement  algorithm  is 
shown  in  Figure  7  and  the  solutions  computed  at  each  base  time  step  are  shown 
in  Figure  8. 

The adaptive composite mesh of Figure 7 shows a distinct pattern
associated with using the MacCormack scheme with Arney and Flaherty's [5]
local refinement strategy. Spurious oscillations of the solution on the base
mesh cause several levels of refinement, which drastically reduce the base mesh
spacing at the beginning of each base time step. However, once these
oscillations have been controlled, the need for refinement is reduced at the
later stages of the adaptive procedure. This situation could be alleviated by
including an artificial viscosity model with the MacCormack scheme, as in
Example 4.1.


    Δx         N     Exact Error      Estimated Error    Effectivity Ratio
                       ||e||_1            ||E||_1                θ

  0.1          8    .352 × 10^-1       .467 × 10^-2            0.133
  0.05        16    .132 × 10^-1       .234 × 10^-2            0.177
  0.025       32    .236 × 10^-2       .106 × 10^-2            0.449
  0.0125      64    .256 × 10^-3       .138 × 10^-3            0.539
  0.00625    128    .380 × 10^-4       .294 × 10^-4            0.773
  0.00312    256    .403 × 10^-5       .303 × 10^-5            0.752
  0.00156    512    .661 × 10^-6       .538 × 10^-6            0.814

Table 1. Exact and estimated errors for different mesh
sizes for Example 4.2.
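
The effectivity ratios reported in Table 1 can be cross-checked against the tabulated norms with a few lines of Python. The values below are transcribed from Table 1 as read here, and the helper name `effectivity` is an assumption of this sketch.

```python
def effectivity(estimated_norm, exact_norm):
    """Effectivity index: estimated error norm divided by exact error norm."""
    return estimated_norm / exact_norm


# (dx, N, exact error norm, estimated error norm, reported ratio), from Table 1.
table1 = [
    (0.1,       8, 0.352e-1, 0.467e-2, 0.133),
    (0.05,     16, 0.132e-1, 0.234e-2, 0.177),
    (0.025,    32, 0.236e-2, 0.106e-2, 0.449),
    (0.0125,   64, 0.256e-3, 0.138e-3, 0.539),
    (0.00625, 128, 0.380e-4, 0.294e-4, 0.773),
    (0.00312, 256, 0.403e-5, 0.303e-5, 0.752),
    (0.00156, 512, 0.661e-6, 0.538e-6, 0.814),
]

recomputed = [effectivity(est, exact) for _, _, exact, est, _ in table1]
max_diff = max(abs(r - row[4]) for r, row in zip(recomputed, table1))
```

The recomputed ratios agree with the reported column to within the rounding of the tabulated three-digit values.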

Example 4.3. Consider the linear scalar hyperbolic differential equation

$$u_t + 2u_x + 2u_y = 0, \quad t > 0,\ 0.2 < x < 1.2,\ 0 < y < 1, \tag{4.8}$$

with initial conditions

$$u(x,y,0) = \frac{1 - \tanh[3\,(x - 0.1y + 0.1)]}{2}, \tag{4.9}$$

and with Dirichlet boundary conditions specified so that the exact solution of
this problem is

$$u(x,y,t) = \frac{1 - \tanh[3\,(x - 0.1y - 1.8t + 0.1)]}{2}. \tag{4.10}$$

This solution is a smooth wave that moves at an angle of 45 degrees
across the domain. The problem was selected to show the convergence and
accuracy of the Richardson extrapolation error estimation procedures (3.4,5)
and (3.11,12). We solve (4.8,9) for one time step, $\Delta t = 0.012$, on eight
different meshes. The mesh strategy of each calculation is described as
follows:


1) a stationary uniform (10 × 10) rectangular mesh,

2) a stationary uniform (20 × 20) rectangular mesh,

3) a stationary uniform (40 × 40) rectangular mesh,

4) a stationary uniform (60 × 60) rectangular mesh,

5) a stationary (40 × 40) mesh of nonuniform quadrilateral cells,

6) a moving (20 × 20) mesh with uniform rectangles,

7) a moving (20 × 20) mesh of nonuniform quadrilateral cells,

8) a moving (40 × 40) mesh of nonuniform quadrilateral cells.


Table  2  shows  the  results  obtained  from  these  calculations  by  comparing  the 
exact  errors  and  the  effectivity  indices  for  the  eight  strategies. 

Strategies 1-4 show the convergence of the error estimates on uniform
meshes as the number of nodes increases. These errors show a rate of convergence
of $O(\Delta x^2, \Delta y^2)$, which is predicted in Eq. (3.2). Comparison of
the errors of Strategies 3 and 5 shows that the error is cut in half by
computing with a better nonuniform stationary mesh. Further comparison of
Strategies 5 and 7 shows another reduction of error by half when the mesh is
properly moved. The nonuniformity of the mesh in Strategies 5, 7, and 8
produces little change in the effectiveness of the error estimation. These
nonuniform mesh computations indicate a convergence rate of
$O(\Delta x^{1.32}, \Delta y^{1.32})$.


  Mesh Strategy     Exact error     Estimated error    Effectivity ratio
  (from above)        ||e||_1           ||E||_1                θ

       1             0.0111            0.0071                0.64
       2             0.00370           0.00318               0.86
       3             0.000942          0.000908              0.96
       4             0.000367          0.000368              1.00
       5             0.000399          0.000418              1.04
       6             0.00136           0.00124               0.91
       7             0.000411          0.000370              0.90
       8             0.000167          0.000156              0.94

Table 2. Exact and estimated errors for different mesh
strategies for Example 4.3.
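
An observed order of convergence can be computed from two entries of Table 2 whose meshes differ by a factor of two in each direction. The sketch below is illustrative only: the helper name `observed_order` is an assumption, and a single time step on two meshes gives at best a rough indication of the rate.

```python
import math


def observed_order(err_coarse, err_fine, refinement=2.0):
    """Observed convergence order from errors on two meshes related by `refinement`."""
    return math.log(err_coarse / err_fine) / math.log(refinement)


# Exact error norms taken from Table 2.
order_uniform = observed_order(0.00370, 0.000942)    # Strategy 2 (20x20) -> 3 (40x40)
order_moving = observed_order(0.000411, 0.000167)    # Strategy 7 (20x20) -> 8 (40x40)
```

For these entries the uniform-mesh pair gives an order close to two, while the moving nonuniform pair gives an order of roughly 1.3.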


Example 4.4. Consider the linear scalar hyperbolic differential equation

$$u_t + u_x + 0.25\,u_y = 0, \quad t > 0,\ 0.2 < x < 1.2,\ 0 < y < 1, \tag{4.11}$$

with initial conditions

$$u(x,y,0) = \begin{cases} 0, & \text{if } y < -4x + 1.2, \\ 0.8, & \text{if } y > -4x + 1.6, \\ -8x - 2y + 3.2, & \text{otherwise,} \end{cases} \tag{4.12}$$

and with Dirichlet boundary conditions

$$u(x,y,t) = \begin{cases} 0, & \text{if } y - 0.25t < -4(x - t) + 1.2, \\ 0.8, & \text{if } y - 0.25t > -4(x - t) + 1.6, \\ -8(x - t) - 2(y - 0.25t) + 3.2, & \text{otherwise.} \end{cases} \tag{4.13}$$

The  solution  of  this  problem  is  an  oblique  ramp-like  wave  front  that 
moves  at  an  angle  of  14  degrees  across  the  domain.  The  solution  has  a  jump  in 
its  first  partial  derivatives  at  the  top  and  bottom  edges  of  the  wave  front. 

We  expect  some  difficulty  in  estimating  the  error  near  locations  where  the 
derivatives  jump.  In  the  region  of  the  front  itself  the  gradient  of  the 
solution  is  constant  and  there  is  no  error  in  the  solution  or  in  the  error 
estimate. 

We solved this problem for one time step, $\Delta t = 0.015$, for the following
six mesh strategies:

1)  a  stationary  uniform  (12  x  12)  rectangular  mesh, 

2)  a  stationary  uniform  (24  x  24)  rectangular  mesh, 

3)  a  stationary  uniform  (48  x  48)  rectangular  mesh, 

4)  a  stationary  uniform  (64  x  64)  rectangular  mesh, 

5)  a  stationary  (24  x  24)  mesh  of  nonuniform  quadrilateral  cells, 

6)  a  moving  (24  x  24)  mesh  of  nonuniform  quadrilateral  cells. 


Table  3  shows  the  results  of  these  calculations. 


  Mesh strategy     Exact error     Estimated error    Effectivity ratio
                      ||e||_1           ||E||_1                θ

       1             0.0058            0.0016                0.28
       2             0.00275           0.00110               0.40
       3             0.000866          0.000479              0.55
       4             0.000400          0.000222              0.56
       5             0.00144           0.00078               0.54
       6             0.000720          0.000349              0.49

Table 3. Exact and estimated errors for different mesh strategies
of Example 4.4. The error estimate is inaccurate but the
solution appears to be converging.

The results are once again as expected. The error estimate for this
problem with a jump in the derivative is not as accurate as that for the smooth
solution of Example 4.3. However, the error estimate still shows signs of
converging to the exact error for the uniform meshes of Strategies 1-4.
Once again the better nodal placement of the initial mesh by the mesh
generator of Arney [3] reduces the error by half from a uniform mesh. Also,
moving the mesh by the method of Arney and Flaherty [4] reduces the error by
half again.


5.  CONCLUSION.  We  have  shown  that  MacCormack's  finite  difference  scheme  and 
error  estimation  based  on  Richardson's  extrapolation  can  be  used  on  moving 
grids  with  local  refinement.  With  proper  computation  of  the  transformation 
metrics  and  the  use  of  TVD  artificial  viscosity,  the  MacCormack  scheme  is 
stable  and  is  able  to  solve  problems  with  sharp  discontinuities. 


The  examples  we  have  presented  demonstrate  the  utility  of  these  methods 
and  also  point  out  their  shortcomings.  Of  particular  concern  is  the  lack  of 
any  error  estimation  near  the  boundaries,  the  poor  error  estimation  near 
discontinuities,  and  the  need  to  constrain  the  mesh  to  obtain  any  accurate 
error estimation. These problems must be solved in order to effectively
utilize this solution scheme and error estimation procedure with an adaptive
technique.

ABSTRACT. We present an adaptive local refinement finite element method for
solving vector systems of parabolic partial differential equations in two space
dimensions and time. The algorithm uses the finite element-Galerkin method in
space and backward Euler temporal integration. At each time step we obtain an
estimate of the error on each element, group the elements whose error violates
a user prescribed tolerance, form new local grids and solve the problem again
on each of the new grids. We discuss several aspects of the algorithm,
including the necessary data structures, the error estimation technique, and
the determination of initial and boundary conditions at coarse-fine mesh
interfaces. Finally we present several examples which demonstrate the
viability of our approach.


I. INTRODUCTION. Over the past several years extensive efforts have been
made in using adaptive strategies to solve partial differential equations [2, 3].
In this paper, we consider a local mesh refinement procedure for
two-dimensional parabolic partial differential systems where fine meshes are
introduced in regions where greater resolution is deemed necessary. Our
approach permits finer meshes to overlap elements of coarser ones and is
related to an earlier effort on h-refinement methods for one-dimensional
parabolic problems [5, 7, 10].

This research was partially supported by the U. S. Air Force Office of Scientific Research, Air Force Systems Command, USAF,
under Grant Number AFOSR 85-0156 and by the SDIO/IST under management of the U. S. Army Research Office under Contract
Number DAAL 03-86-K-0112.

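We consider an initial-boundary value problem for an m-dimensional vector
system having the form

$$\mathbf{u}_t + \mathbf{f}(x,y,t,\mathbf{u},\mathbf{u}_x,\mathbf{u}_y) = [\mathbf{D}_1(x,y,t,\mathbf{u})\,\mathbf{u}_x]_x + [\mathbf{D}_2(x,y,t,\mathbf{u})\,\mathbf{u}_y]_y, \quad (x,y) \in \Omega,\ t > 0, \tag{1a}$$

$$\mathbf{u}(x,y,0) = \mathbf{u}_0(x,y), \quad (x,y) \in \Omega \cup \partial\Omega, \tag{1b}$$

$$\mathbf{u}(x,y,t) = \mathbf{g}_D(x,y,t), \quad (x,y) \in \partial\Omega_D,\ t > 0, \tag{1c}$$

$$\mathbf{D}_1\mathbf{u}_x\eta_1 + \mathbf{D}_2\mathbf{u}_y\eta_2 = \mathbf{g}_N(x,y,t), \quad (x,y) \in \partial\Omega_N,\ t > 0. \tag{1d}$$

The domain $\Omega$ is the rectangle $\{(x,y) \mid a < x < b,\ c < y < d\}$ with boundary
$\partial\Omega = \partial\Omega_D \cup \partial\Omega_N$ and unit outer normal
$\boldsymbol{\eta} := [\eta_1, \eta_2]^T$. The system (1) is assumed
to be well posed and parabolic, i.e., $\mathbf{D}_1$ and $\mathbf{D}_2$ are positive definite. We do not expect
that our methods will be able to solve all problems having this generality, but our
one-dimensional procedure [10] has worked well on a wide range of linear and
nonlinear problems.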

Our  approach  begins  with  the  solution  of  (1)  on  a  uniform  space-time  grid  using 
finite  elements  in  space  and  the  backward  Euler  method  in  time.  At  the  end  of  each 
time step, an indication of the local discretization error is generated on each finite
element. In our initial investigation of one-dimensional problems [5, 7], we used an h-
refinement  (Richardson’s  extrapolation)  procedure  to  compute  a  local  error  indicator. 
This  has  subsequently  been  abandoned  in  favor  of  a  p-refinement  approach  [10],  which 
increases  the  order  of  the  trial  space  instead  of  reducing  the  mesh  spacing.  The  p- 
refinement  strategy  employs  nodal  superconvergence  to  improve  computational 
efficiency  and  it  can  be  used  to  generate  an  asymptotically  correct  estimate  of  the 
discretization  error  [1,  10].  Elements  having  high  error  are  grouped  into  rectangular 
regions  called  megagrids  using  a  nearest  neighbor  clustering  algorithm  (cf.  Berger  and 
Oliger  [4]).  Overlapping  fine  uniform  grids  are  generated  within  the  megagrids  and 
(1)  is  solved  again  on  these  grids.  This  process  is  repeated  until  a  prescribed  local 
error  tolerance  is  satisfied.  An  illustration  of  a  coarse  spatial  mesh  with  two 
megagrids  and  three  fine  grids  is  shown  in  Figure  1. 

A  tree  is  a  natural  data  structure  to  manage  the  information  associated  with  all  of 
the  grids.  Nodes  of  the  tree  represent  data  at  the  megagrid  level,  with  finer  megagrids 
regarded  as  offspring  of  coarser  ones.  Information  associated  with  overlapping  fine 
grids  within  each  megagrid  are  stored  as  records  at  the  nodes  of  the  tree. 

A finite element problem is formulated and solved on each grid within a
megagrid. This necessitates the prescription of appropriate initial and boundary
conditions on each space-time grid. Since our temporal integration is implicit,
prescribing boundary conditions is particularly complex in regions where meshes
overlap (cf. Figure 1). An iterative procedure, analogous to Schwarz alternation
(cf. Dihn et al. [6]), is used to successively calculate solutions on fine grids
within each megagrid. We observe that this procedure converges for a variety of
problems, but have no analysis demonstrating either convergence or stability.
Starius [11] obtained some stability results on a similar method for hyperbolic
equations.

Figure 1. Coarse spatial background mesh with two offspring megagrids
(marked with diamonds and squares) and their local grids. High-error
elements of the coarse mesh are indicated by x's.

A description of the data structures and the local refinement procedure is given in
Section II. In Section III we present the finite element method and the local error
estimation technique. Section IV contains some preliminary computational results on
three linear parabolic problems. Our conclusions and plans for further improvements
are described in Section V. The examples indicate that the error estimation procedure
converges to the true discretization error as the mesh is refined and the solution
procedure based on the Schwarz alternating technique converges.


II. LOCAL REFINEMENT AND DATA STRUCTURES. We outline our procedure
for solving (1) on an arbitrary hexahedral megagrid $R(\omega,p,q,F,S,L)$. The
domain $\omega := \{(x,y) \mid \alpha < x < \beta,\ \gamma < y < \delta\}$; p and q are the times at the beginning
and end of the time step, respectively; F and S point to the parent and offspring
megagrids, respectively; and L is the record of information for the local rectangular
grids within R.

A  top  level  description  of  our  local  refinement  algorithm  is  presented  in  Figure  2. 
A  solution  and  error  indicators  are  generated  on  R  using  procedure  solve  (cf.  Figure 
3). Elements where the error indicator exceeds a prescribed tolerance tol are
partitioned into rectangular regions using the nearest neighbor clustering algorithm. As
noted,  we  call  these  regions  megagrids.  Berger  and  Oliger’s  [4]  bisection  and  merging 
procedure  is  used  to  generate  local  uniform  fine  grids  for  each  megagrid.  Local  grids 
within  a  megagrid  can  overlap,  but  the  megagrids  are  independent  of  each  other, 
hence,  each  offspring  megagrid  may  have  different  spatial  and  temporal  refinement 
factors.  This  also  reduces  communication  between  megagrids  and,  thus,  simplifies  the 
computation  of  initial  conditions  on  offspring  megagrids.  This  representation  may 
additionally be suitable for execution on parallel computers. Temporal refinement
factors are calculated and solutions are recursively generated for each megagrid.

In order to solve problem (1), the procedure locref is invoked on the coarse grids
$R(\Omega, t_k, t_{k+1}, 0, S, L)$, $k = 0, 1, \ldots$. Solutions satisfying the prescribed accuracy
requirements are generated at each time $t_k$, $k = 1, 2, \ldots$.
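
The temporal part of this recursion can be sketched in Python as follows. This is a deliberately reduced illustration: it follows a single chain of offspring and omits the spatial clustering and grid generation, and the callables `solve` and `tref_for`, the list `visited`, and the toy error model are all assumptions made for the example.

```python
def time_subintervals(p, q, tref):
    """Split the step [p, q] into tref equal sub-steps (p[i], q[i])."""
    h = (q - p) / tref
    steps = []
    for i in range(1, tref + 1):
        p_i = p + (i - 1) * h
        steps.append((p_i, p_i + h))
    return steps


def locref(p, q, level, tol, solve, tref_for, visited):
    """Recursive refinement skeleton: solve, then recurse on sub-steps if needed."""
    indicator = solve(p, q, level)      # hypothetical: returns the largest error indicator
    visited.append((level, p, q))
    if indicator > tol:
        for p_i, q_i in time_subintervals(p, q, tref_for(level)):
            locref(p_i, q_i, level + 1, tol, solve, tref_for, visited)


# Toy model: the indicator drops by a factor of four per level of refinement.
visited = []
locref(0.0, 1.0, 0, 0.1,
       solve=lambda p, q, level: 1.0 / 4 ** level,
       tref_for=lambda level: 2,
       visited=visited)
```

With the toy indicator and a tolerance of 0.1, the recursion stops after two levels of refinement, having solved on one base step, two half steps, and four quarter steps.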

The solution on a megagrid $R(\omega,p,q,F,S,L)$ is described by the procedure solve
of Figure 3. Initial conditions are generated for each local computation grid contained
in R. Following this, we compute an initial guess for the boundary conditions of the
local grids using either the prescribed boundary data at physical boundaries or linear
interpolation in time from the parent megagrid of R. A finite element solution is
generated for one of the local grids and its solution is used to update boundary conditions
on all other intersecting local grids. This solution process is repeated on each local
grid in turn until satisfactory convergence is attained. Our procedure is, thus, similar
to the Schwarz alternating principle for elliptic problems, which has been used recently
to develop domain decomposition methods for parallel computation [6, 8].
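
The alternation idea can be seen on a much simpler model than the one treated here. The Python sketch below applies the classical Schwarz alternating iteration to the one-dimensional elliptic problem -u'' = 1 on (0, 1) with u(0) = u(1) = 0, using two overlapping subintervals and second-order finite differences. The model problem, the subdomain choice, and the helper `solve_dirichlet` are assumptions of this example and are not the finite element procedure of the paper.

```python
def solve_dirichlet(f, h, left, right):
    """Solve -u'' = f at interior nodes with given end values (Thomas algorithm)."""
    m = len(f)
    rhs = [h * h * fk for fk in f]
    rhs[0] += left          # boundary values move to the right-hand side
    rhs[-1] += right
    cp = [0.0] * m          # modified super-diagonal
    dp = [0.0] * m          # modified right-hand side
    cp[0] = -0.5
    dp[0] = rhs[0] / 2.0
    for k in range(1, m):
        denom = 2.0 + cp[k - 1]
        cp[k] = -1.0 / denom
        dp[k] = (rhs[k] + dp[k - 1]) / denom
    u = [0.0] * m
    u[-1] = dp[-1]
    for k in range(m - 2, -1, -1):
        u[k] = dp[k] - cp[k] * u[k + 1]
    return u


n = 20
h = 1.0 / n
i1, i2 = 12, 8              # subdomain 1: nodes 0..i1, subdomain 2: nodes i2..n
u = [0.0] * (n + 1)         # initial guess, including the interface values

for sweep in range(60):
    # Solve on subdomain 1 with the current value at node i1 as boundary data.
    u[1:i1] = solve_dirichlet([1.0] * (i1 - 1), h, u[0], u[i1])
    # Solve on subdomain 2 with the updated value at node i2 as boundary data.
    u[i2 + 1:n] = solve_dirichlet([1.0] * (n - i2 - 1), h, u[i2], u[n])

exact = [0.5 * (k * h) * (1.0 - k * h) for k in range(n + 1)]
max_err = max(abs(a - b) for a, b in zip(u, exact))
```

For this model the interface error contracts by a fixed factor in every sweep, so the iterates converge to the single-domain discrete solution, which here coincides with x(1 - x)/2 at the nodes.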

A local grid is denoted as $T(x_m, y_m, dx, dy, s)$. Each local rectangular grid is
characterized by the coordinates of its center $(x_m, y_m)$, the lengths of its sides $dx$ and
$dy$, and the slope $s$ of a side of the rectangle. In order to avoid ambiguity, we choose
$s \ge 0$ and let $dx$ correspond to this side (cf. Figure 1). The number of elements
$m_i \times n_i$ on local grid $T_i = T((x_m)_i, (y_m)_i, (dx)_i, (dy)_i, s_i)$ is determined by a single mesh

procedure locref (R(ω,p,q,F,S,L), tol);
begin
    solve (R(ω,p,q,F,S,L));
    if any error indicator > tol then
    begin
        Form offspring megagrids;
        for j := 1 to number of offspring do
            Create local rectangular grids;
        for j := 1 to number of offspring do
            Calculate the temporal refinement factor tref[j];
        for j := 1 to number of offspring do
            for i := 1 to tref[j] do
            begin
                p[i] := p + (i-1)*(q-p)/tref[j];
                q[i] := p[i] + (q-p)/tref[j];
                locref (R(ω[j], p[i], q[i], R(ω,p,q,F,S,L), S[j], L[j]), tol)
            end
    end
end { locref };


Figure 2. Recursive local refinement algorithm for the solution of (1) on
R(ω,p,q,F,S,L) with an error tolerance tol.


spacing  parameter  hR  as  m4-  =  round (dx/hR)  and  n(  =  round (dx/hR).  Thus,  each  local 
grid  in  R  has  approximately  the  same  spatial  resolution.  Many  details  of  this  algo¬ 
rithm  have  been  omitted  and  additional  information  is  presented  in  Moore  [9].  For 
example,  a  strategy  has  been  developed  for  storing  the  finite  element  solution  at  p  and 
q  without  unnecessary  duplication  or  copying  of  information. 
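The recursive structure of Figure 2 can be illustrated with a small Python sketch. It is a deliberately reduced model, not the authors' implementation: there is a single offspring per level, the spatial regridding is omitted, and `error_indicator` is a hypothetical stand-in for "solve on the megagrid and return its largest error indicator". The names `locref`, `tref`, `p`, `q` and `tol` follow the figure; `max_level` is an added safeguard.

```python
def locref(p, q, tol, error_indicator, tref=2, level=0, max_level=10):
    """Return the accepted time slabs (p_i, q_i, level) covering [p, q].

    error_indicator(p, q, level) is a hypothetical placeholder for the
    'solve' call of Figure 2 followed by an error estimate.
    """
    err = error_indicator(p, q, level)        # plays the role of solve(R)
    if err <= tol or level == max_level:
        return [(p, q, level)]                # accepted: no offspring needed
    accepted = []
    dt = (q - p) / tref                       # time step of the offspring
    for i in range(tref):                     # Figure 2 counts i = 1..tref
        p_i = p + i * dt
        q_i = p_i + dt
        accepted += locref(p_i, q_i, tol, error_indicator,
                           tref, level + 1, max_level)
    return accepted


# Toy indicator (an assumption): the error equals the length of the time slab.
slabs = locref(0.0, 1.0, tol=0.3, error_indicator=lambda p, q, level: q - p)
print(slabs)
```

With this toy indicator the interval [0, 1] is refined twice, giving four slabs of length 0.25 at level 2.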


Initial conditions for each local grid are either determined from (1b) when p = 0
or by bilinear interpolation using the finest grids available in the tree structure at time
p > 0. Isolating local grids within megagrids greatly simplifies the search for data
needed for this bilinear interpolation. Thus, the search for a solution value at an arbitrary
point is performed at the megagrid level until the finest megagrid containing the
point has been identified. The local grids of this finest megagrid provide the necessary
interpolation data. Scanning the points of a grid in a predetermined order can be used
to further reduce the complexity of the search procedure.
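The megagrid-level search described above can be sketched as a descent through a tree. The sketch below is only an illustration under simplifying assumptions: each megagrid is represented by an axis-aligned bounding box (the paper's local grids may be rotated), and the class `Megagrid` and function `finest_megagrid` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Megagrid:
    # Axis-aligned bounding box; a simplification made for this sketch.
    xmin: float
    xmax: float
    ymin: float
    ymax: float
    name: str = ""
    offspring: list = field(default_factory=list)

    def contains(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


def finest_megagrid(root, x, y):
    """Descend the tree until no offspring megagrid contains (x, y)."""
    if not root.contains(x, y):
        return None
    node = root
    while True:
        child = next((c for c in node.offspring if c.contains(x, y)), None)
        if child is None:
            return node          # its local grids supply the interpolation data
        node = child


fine = Megagrid(0.25, 0.5, 0.25, 0.5, "level 2")
mid = Megagrid(0.0, 0.5, 0.0, 0.5, "level 1", [fine])
root = Megagrid(0.0, 1.0, 0.0, 1.0, "level 0", [mid])
print(finest_megagrid(root, 0.3, 0.3).name)
print(finest_megagrid(root, 0.9, 0.9).name)
```

Only bounding boxes are tested during the descent; the local grids themselves are examined once, at the finest level reached.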


Similar considerations are required to determine boundary conditions on grid
edges that are not subsets of $\partial\Omega$. Our one-dimensional techniques [10] and the explicit
finite difference procedures of Berger and Oliger [4] used the notion of a "buffer"
to apply boundary conditions. The idea is to enlarge a local rectangular grid by
increasing dx and dy by two or four elements so that "artificial boundary conditions"
may be obtained from data in low-error regions. However, in regions where local
grids overlap, accurate boundary conditions cannot be obtained from parent grid data
even with a buffer. Buffers do provide accurate boundary conditions in regions where
grids do not overlap and, for this reason, we continue to use them.

procedure solve (R(ω,p,q,F,S,L));
begin
   for i := 1 to number of local grids do
   begin
      Compute initial conditions for local grid
         T((x_m)_i,(y_m)_i,(dx)_i,(dy)_i,s_i);
      Compute boundary conditions for
         T((x_m)_i,(y_m)_i,(dx)_i,(dy)_i,s_i);
   end
   for j := 1 to number of iterations do
      for i := 1 to number of local grids do
      begin
         Solve the finite element problem for (1) on
            T((x_m)_i,(y_m)_i,(dx)_i,(dy)_i,s_i);
         if j = number of iterations then
            Compute error on T((x_m)_i,(y_m)_i,(dx)_i,(dy)_i,s_i);
         Update appropriate boundary conditions
      end
end { solve };

Figure 3. Solution algorithm on megagrid R(ω,p,q,F,S,L).

Dirichlet  boundary  conditions  are  obtained  on  the  edges  of  buffered  local  grids  by 
piecewise  bilinear  interpolation  in  time  using  solution  values  from  the  parent  megagrid. 
In  non-overlapping  buffered  regions,  the  interpolated  boundary  conditions  satisfy  the 
prescribed  error  tolerance  and  are,  thus,  expected  to  produce  acceptable  accuracy.  As 
noted,  accurate  boundary  conditions  are  obtained  in  regions  where  local  grids  overlap 
by means of the Schwarz alternating principle. Hence, we initially solve a finite element
problem on local grid $T_1$ of R, realizing that the interpolated boundary data may
be inaccurate in regions where $T_1$ intersects other local grids. In solving the problem
on $T_2$ we use boundary data from $T_1$ with bilinear interpolation in regions where $T_1$
and $T_2$ intersect. This sequential updating procedure can be continued iteratively until
satisfactory  convergence  is  obtained.  In  practice,  we  halt  the  iteration  after  a  few 
cycles  and  compute  an  error  estimate  for  each  local  grid  in  R.  The  grids  of  R  are 
refined  if  the  error  tolerance  is  not  satisfied.  Thus,  we  do  not  distinguish  between 
failure  of  the  Schwarz  iteration  to  converge  and  failure  to  satisfy  prescribed  accuracy 
conditions. 
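The alternating update of boundary data between overlapping grids can be illustrated on a one-dimensional model problem. The sketch below is not the authors' two-dimensional finite element procedure: it assumes the problem u'' = 0 on (0, 1) with u(0) = 0 and u(1) = 1, two overlapping subdomains (0, b) and (a, 1), and exact subdomain solves (straight lines), so that only the two interface values need to be tracked. The function name `schwarz_1d` is hypothetical.

```python
def schwarz_1d(a=0.4, b=0.6, iterations=8):
    """Alternating Schwarz iteration for u'' = 0, u(0) = 0, u(1) = 1.

    Subdomain 1 is (0, b) and subdomain 2 is (a, 1), with a < b so that
    they overlap. Returns the interface values and the change in u(b)
    produced by each sweep.
    """
    g_b = 0.0                      # initial (inaccurate) guess for u(b)
    history = []
    for _ in range(iterations):
        # Solve on (0, b) with u(0) = 0, u(b) = g_b; sample the line at x = a.
        g_a = g_b * a / b
        # Solve on (a, 1) with u(a) = g_a, u(1) = 1; sample the line at x = b.
        new_g_b = g_a + (1.0 - g_a) * (b - a) / (1.0 - a)
        history.append(abs(new_g_b - g_b))
        g_b = new_g_b
    return g_a, g_b, history


g_a, g_b, history = schwarz_1d(iterations=60)
print(g_a, g_b)
print(history[:4])
```

For this model problem the interface values approach those of the exact solution u(x) = x, and the change per sweep shrinks by the fixed factor (a/b)(1-b)/(1-a), which is 4/9 for the default overlap.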

Treatment of situations where local grids overlap $\partial\Omega$ is considerably more complex.
A second complication arises when a local grid crosses the boundary of
$\bigcup_i (T_F)_i$, where the subscript F denotes the parent megagrid of R. These issues are
handled by regridding as described in Moore [9].

III. SPATIAL AND TEMPORAL DISCRETIZATION. As noted, the partial
differential system (1) is discretized on a local grid T of R using a finite element
Galerkin procedure in space and the backward Euler method in time. For each time
$t \in [p, q]$, we assume that $\mathbf{u} \in H^1_E$ and select a test function $\mathbf{v} \in H^1_0$, where $H^1$ denotes
the usual Sobolev space. Functions that further satisfy Dirichlet conditions on $\partial T$ are
said to belong to $H^1_E$, while functions satisfying trivial Dirichlet conditions belong to
$H^1_0$.

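The Galerkin form of (1) on T is

$(\mathbf{v}, \mathbf{u}_t) + (\mathbf{v}, \mathbf{f}(\cdot, t, \mathbf{u}, \mathbf{u}_x, \mathbf{u}_y)) + A(\mathbf{v}, \mathbf{u}) = \int_{\partial T \cap \partial\Omega_N} \mathbf{v}^T \mathbf{g}_N \, ds$, for all $\mathbf{v} \in H^1_0$,  (2a)

where

$(\mathbf{v}, \mathbf{u}) = \int_T \mathbf{v}^T \mathbf{u} \, dx\,dy$,  (2b)

$A(\mathbf{v}, \mathbf{u}) = \int_T [\mathbf{v}_x^T D_1(x,y,t,\mathbf{u}) \mathbf{u}_x + \mathbf{v}_y^T D_2(x,y,t,\mathbf{u}) \mathbf{u}_y] \, dx\,dy$.  (2c)

Initial conditions are required at p = 0 and these can be obtained, e.g., by $L^2$ or $H^1$
projection. Initial conditions for p > 0 trivially follow from the solution at the end of
the previous time step.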

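A finite element solution of (2) is obtained by approximating $H^1$ by a finite
dimensional subspace $K$ of piecewise bilinear polynomials on T. The finite element
solution $\mathbf{U}$ satisfies

$(\mathbf{V}, \mathbf{U}_t) + (\mathbf{V}, \mathbf{f}(\cdot, t, \mathbf{U}, \mathbf{U}_x, \mathbf{U}_y)) + A(\mathbf{V}, \mathbf{U}) = \int_{\partial T \cap \partial\Omega_N} \mathbf{V}^T \mathbf{g}_N \, ds$, for all $\mathbf{V} \in K_0$,  (3a)

$\mathbf{U}(x,y,p) = P\mathbf{u}^0$ for $p = 0$, and $\mathbf{U}(x,y,p) = P\mathbf{U}(x,y,p^-)$ for $p > 0$.  (3b)

The projection P at p = 0 is obtained by constructing a piecewise bilinear approximation
of $\mathbf{u}^0$. For p > 0, we proceed in a similar manner except that we construct interpolants
using the finest grid solution available at $t = p^-$.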

Temporal discretization of (3) is performed by the backward Euler method; thus,
we determine $\mathbf{U}^q(x,y)$ as a discrete approximation of $\mathbf{U}(x,y,q)$ by solving

$(\mathbf{V}, \mathbf{U}^q) + \Delta t\,[(\mathbf{V}, \mathbf{f}(\cdot, q, \mathbf{U}^q, \mathbf{U}^q_x, \mathbf{U}^q_y)) + A(\mathbf{V}, \mathbf{U}^q)] = (\mathbf{V}, \mathbf{U}^p) + \Delta t \int_{\partial T \cap \partial\Omega_N} \mathbf{V}^T \mathbf{g}_N(x,y,q) \, ds$, for all $\mathbf{V} \in K_0$.  (4)

Initial conditions for the discrete system (4) follow the lines of (3b) for the
semi-discrete system.
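The structure of a backward Euler step solved by Newton's method can be seen on a scalar model problem. The following Python sketch is only an analogue of (4), not the finite element system itself: it assumes the scalar equation u' + f(u) = 0, and the function name `backward_euler_step` and the test nonlinearity f(u) = u^3 are illustrative choices.

```python
def backward_euler_step(u_p, dt, f, dfdu, tol=1e-12, max_iter=20):
    """Advance u' + f(u) = 0 from time p to q = p + dt by backward Euler.

    The implicit equation u_q + dt*f(u_q) = u_p is solved by Newton's
    method, using the previous solution u_p as the initial guess.
    """
    u = u_p
    for _ in range(max_iter):
        residual = u + dt * f(u) - u_p
        if abs(residual) < tol:
            break
        u -= residual / (1.0 + dt * dfdu(u))   # Newton update
    return u


u_q = backward_euler_step(1.0, 0.1, f=lambda u: u**3, dfdu=lambda u: 3 * u**2)
print(u_q)
```

In the finite element setting the scalar derivative 1 + dt*f'(u) is replaced by the Jacobian matrix of the discrete system, and each Newton update requires a linear solve.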

A posteriori estimates of the discretization error of the solution of (4) are
obtained by means of a p-refinement technique. To begin, we calculate a second solution
$\hat{\mathbf{U}}^q(x,y)$ of (2) using piecewise quadratic polynomials in space and trapezoidal
rule integration in time. This solution is higher order in both space and time than the
solution of (4); thus, the difference $\|\mathbf{U}^q - \hat{\mathbf{U}}^q\|_1$ furnishes an estimate of the discretization
error of $\mathbf{U}^q$. The computational efficiency of this procedure can be substantially
improved by using the nodal superconvergence property of finite element methods for
parabolic problems [1, 10]. Nodal superconvergence implies that bilinear finite element
solutions converge at a faster rate in space at nodes than elsewhere. These considerations
imply that $\hat{\mathbf{U}}^q$ can be calculated as

$\hat{\mathbf{U}}^q(x,y) = \bar{\mathbf{U}}^q(x,y) + \mathbf{E}^q(x,y)$,  (5)

where $\bar{\mathbf{U}}^q(x,y)$ is a piecewise bilinear function and $\mathbf{E}^q(x,y)$ is a piecewise serendipity
function (a biquadratic polynomial less a quartic term) that vanishes at the nodes of T.
Specifically, we find that $\bar{\mathbf{U}}^q(x,y)$ satisfies

$(\mathbf{V}, (\bar{\mathbf{U}}^q - \mathbf{U}^p)/\Delta t) + \tfrac{1}{2}[(\mathbf{V}, \mathbf{f}(\cdot, q, \bar{\mathbf{U}}^q, \bar{\mathbf{U}}^q_x, \bar{\mathbf{U}}^q_y)) + (\mathbf{V}, \mathbf{f}(\cdot, p, \mathbf{U}^p, \mathbf{U}^p_x, \mathbf{U}^p_y))]$
$\quad + \tfrac{1}{2}[A(\mathbf{V}, \bar{\mathbf{U}}^q) + A(\mathbf{V}, \mathbf{U}^p)] = \tfrac{1}{2}\int_{\partial T \cap \partial\Omega_N} \mathbf{V}^T \mathbf{g}_N(x,y,q) \, ds + \tfrac{1}{2}\int_{\partial T \cap \partial\Omega_N} \mathbf{V}^T \mathbf{g}_N(x,y,p) \, ds$,
for all $\mathbf{V} \in K_0$.  (6)

Thus, a trapezoidal rule integration step is performed using the backward Euler solution
$\mathbf{U}^p(x,y)$ as an initial condition. Both (4) and (6) are nonlinear algebraic systems,
which we solve by Newton’s method. In order to reduce the computational effort
associated with assembling and solving (6), the Jacobian of (4) is used for both Newton
iterations. The solution of (4) is obtained first and the result $\mathbf{U}^q(x,y)$ is used as an
initial guess for $\bar{\mathbf{U}}^q(x,y)$.

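The piecewise quadratic correction $\mathbf{E}^q(x,y)$ satisfies

$(\mathbf{V}, [(\bar{\mathbf{U}}^q + \mathbf{E}^q) - (\mathbf{U}^p + \mathbf{E}^p)]/\Delta t) + \tfrac{1}{2}[(\mathbf{V}, \mathbf{f}(\cdot, q, \bar{\mathbf{U}}^q + \mathbf{E}^q, \bar{\mathbf{U}}^q_x + \mathbf{E}^q_x, \bar{\mathbf{U}}^q_y + \mathbf{E}^q_y))$
$\quad + (\mathbf{V}, \mathbf{f}(\cdot, p, \mathbf{U}^p + \mathbf{E}^p, \mathbf{U}^p_x + \mathbf{E}^p_x, \mathbf{U}^p_y + \mathbf{E}^p_y))] + \tfrac{1}{2}[A(\mathbf{V}, \bar{\mathbf{U}}^q + \mathbf{E}^q) + A(\mathbf{V}, \mathbf{U}^p + \mathbf{E}^p)]$
$\quad = \tfrac{1}{2}\int_{\partial T \cap \partial\Omega_N} \mathbf{V}^T \mathbf{g}_N(x,y,q) \, ds + \tfrac{1}{2}\int_{\partial T \cap \partial\Omega_N} \mathbf{V}^T \mathbf{g}_N(x,y,p) \, ds$, for all $\mathbf{V} \in \hat{K}_0$.  (7)

As noted, the space $\hat{K}_0$ consists of piecewise serendipity functions that vanish at
the vertices of the elements. Trivial initial conditions are used in the solution of (7)
for p > 0. Interpolated values of the initial error $\mathbf{u}^0(x,y) - \mathbf{U}(x,y,0)$ onto $\hat{K}_0$ are
used at p = 0.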

Linear systems associated with the application of Newton’s iteration to (4), (6),
and (7) are solved by the Lanczos acceleration of the Jacobi iterative method as implemented
in the iterative solution package ITPACK of Young and Mai [12].

IV. EXAMPLES. We consider a sequence of three linear problems that are
designed to illustrate the performance of our error estimation and local refinement
procedures and convergence of the Schwarz iteration. Our results are very preliminary
and additional computational work and analysis will be necessary before firm conclusions
can be drawn.

Performance of our error estimation technique is measured by the effectivity ratio

$\theta = \dfrac{\|\mathbf{U}^q - \hat{\mathbf{U}}^q\|_1}{\|\mathbf{u}(x,y,q) - \mathbf{U}^q\|_1}$,  (8)

which is a ratio of the estimated to the actual error in the $H^1$ norm. Ideally, the
effectivity ratio should approach unity as the mesh is refined and should not differ
substantially from unity over a large range of mesh spacings. The convergence of our
error estimate to the true discretization error has been established for one-dimensional
linear problems [10].
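The behaviour expected of the effectivity ratio can be seen on a scalar model problem. The sketch below is a time-only analogue, not the paper's finite element estimate: it assumes the test equation u' = -lam*u, uses a backward Euler step as the low-order solution and a trapezoidal rule step as the higher-order one, and takes their difference as the error estimate. The function name `effectivity` and the parameter `lam` are illustrative.

```python
import math


def effectivity(dt, lam=1.0, u0=1.0):
    """Estimated / actual error of one backward Euler step for u' = -lam*u."""
    u_be = u0 / (1.0 + lam * dt)                                  # backward Euler
    u_tr = u0 * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)   # trapezoidal rule
    estimated = abs(u_be - u_tr)
    actual = abs(u0 * math.exp(-lam * dt) - u_be)
    return estimated / actual


for dt in (0.1, 0.05, 0.025):
    print(dt, effectivity(dt))
```

For this model problem the ratio stays slightly above one and moves toward one as the step is reduced, which is the qualitative behaviour sought from (8).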

Example 1. Consider the linear constant coefficient heat conduction problem on
$\Omega := \{(x,y) \mid 0 < x, y < \pi\}$

$u_t = \tfrac{1}{2}(u_{xx} + u_{yy})$,  $(x,y) \in \Omega$,  $t > 0$,  (9a)

$u(x,y,0) = \sin x \sin y$,  $(x,y) \in \Omega$,  (9b)

$u(x,y,t) = 0$,  $(x,y) \in \partial\Omega$,  $t > 0$.  (9c)

The exact solution of this problem is

$u(x,y,t) = e^{-t} u(x,y,0)$.  (10)

We solved (9) for a single time step on uniform grids having equal temporal and
spatial mesh spacings of $\pi/J$, J = 10, 20, 40. The exact error and effectivity ratio are
presented in Table 1. The results indicate that the finite element solution is converging
at a linear rate and that the effectivity ratio is converging to unity.
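The claim of linear convergence can be checked directly from the errors reported in Table 1. The short Python sketch below computes the observed order between successive meshes; it assumes that the three tabulated errors correspond, in order, to J = 10, 20, 40, and the function name `observed_orders` is illustrative.

```python
import math

# J -> error in the H^1 norm, as reported in Table 1 (row order assumed).
errors = {10: 0.1578, 20: 0.0882, 40: 0.0469}


def observed_orders(errors):
    """Observed convergence order between each pair of successive meshes."""
    J = sorted(errors)
    return [math.log(errors[a] / errors[b]) / math.log(b / a)
            for a, b in zip(J, J[1:])]


print(observed_orders(errors))
```

The two observed orders are about 0.84 and 0.91, consistent with an error that is approaching first order in the mesh spacing.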

    J     $\|u(x,y,\Delta t) - U^q\|_1$     $\theta$
   10            0.1578                     1.050
   20            0.0882                     1.012
   40            0.0469                     1.003

Table 1. Error and effectivity ratio for one time step and uniform spatial
meshes of spacing $\pi/J$ for Example 1.

Example 2. Consider the forced heat conduction equation on
$\Omega := \{(x,y) \mid 0 < x, y < 1\}$

$u_t + f(x,y,t) = u_{xx} + u_{yy}$,  $(x,y) \in \Omega$,  $t > t_0$,  (11)

with f(x,y,t) and the initial and Dirichlet boundary conditions specified so that the
exact solution is


u(x,y,t)  =  sinjcte'-20^-^1^5’-1'4^. 


With $t_0 = 0.5$, we solve (11) for one time step on uniform grids having equal temporal
and spatial meshes of 1/J, J = 10, 20, 40. Results similar to those of Example 1 are
displayed in Table 2. Thus, once again, the error is converging to zero at a linear rate
and the effectivity ratio is tending to unity and is close to unity for all meshes. In this
example, as opposed to Example 1, the effectivity ratio appears to be converging to
unity from below. In practice, an upper bound is more suited to an adaptive local
refinement procedure.


    J     $\|u(x,y,\Delta t) - U^q\|_1$     $\theta$
   10            0.6796                     0.996
   20            0.3383                     0.998
   40            0.1668                     0.999

Table 2. Error and effectivity ratio for one time step and uniform spatial
meshes of spacing 1/J for Example 2.


We also solve (11) for $0 = t_0 \le t \le 1$ using the adaptive local refinement strategy
of Section II with a tolerance of 0.05 and an initial 10 x 10 mesh having a time step of
0.1. Surface renditions and contour plots of the solution at t = 0.3, 0.5, and 0.8 are

Example 3. Consider the forced heat conduction equation (11) on
$\Omega := \{(x,y) \mid 0 < x, y < 1\}$ with f(x,y,t) and the initial and Dirichlet boundary
conditions specified so that the exact solution is

$u(x,y,t) = 1.0 - \tanh[10(x + y - t - 0.45)]$.  (13)

This example is used to verify convergence of the Schwarz alternating principle. The
problem is solved for a single time step with $t_0 = 0.5$ on an initial uniform coarse
10 x 10 mesh having a time step of 0.1 and a tolerance of 0.05. Refinement was
needed at the initial time and 10 local grids, as shown in Figure 6, were introduced.
The initial coarse mesh is also shown as a reference. Schwarz iterations were performed
on these grids and we measure the difference in successive solutions on alternating
grids on the portions of the boundaries of each local grid in regions where they
overlap. The maximum such difference after each Schwarz iteration is shown in Table
3. It appears that the iteration is converging at nearly a quadratic rate.


   Iteration     Maximum Difference
       1               0.1506
       2               0.0114
       3               0.0016
       4               0.0004

Table 3. Maximum difference between solutions on the boundaries of overlapping
grids after each Schwarz iteration.


V. CONCLUSIONS. We developed an adaptive local mesh refinement procedure
for nonlinear parabolic systems on rectangular regions. A complex tree data structure
is used to manage a nest of local overlapping grids. An implicit finite element solution
strategy using piecewise bilinear approximations and the backward Euler method is
formulated. We obtain an estimate of the local discretization error of these finite element
solutions using a p-hierarchical approach with piecewise serendipity approximations
and trapezoidal rule integration. The Schwarz alternating principle is used to calculate
boundary conditions on portions of local grids that overlap.

Our results indicate that the error estimation procedure converges to the exact
local error as the mesh is refined. As noted, a proof of this convergence has been
established for certain linear one-dimensional problems (cf. Moore and Flaherty [10]).
It should be possible to construct a proof of convergence of the two-dimensional error
estimate using the ideas developed in the one-dimensional case. The use of the
Schwarz alternating principle also appears to be a very efficient method of calculating
boundary conditions in overlapping-grid regions.

We are encouraged by the performance of our methods on these preliminary
problems; however, several aspects of our approach need improvement. The Lanczos
iteration used to solve the linear system appeared to be far less than optimal. The
stopping criterion used in the ITPACK [12] implementation was too conservative for
our applications. Creation of local solution grids is difficult and complex near domain
boundaries. At present we know of no way of improving this defect. We plan
to extend our methods to non-rectangular domains using an overlapping-grid mesh
generation procedure.


Propagator Matrices in the Solution of EMP Problems

K.  C.  Heaton 

Defence  Research  Establishment  Valcartier 


Abstract 

A general solution to the problem of the electric and magnetic fields produced
by extremely energetic explosions is difficult to obtain since boundary
conditions near the explosion site, at infinity and at any intermediate conducting
surface must all be satisfied. In particular, when these explosions occur
near  the  surface  of  the  Earth,  the  conductivity  of  the  Earth  is  usually  great 
enough  that  the  tangential  component  of  the  quasi-static  electric  field  and  the 
normal  component  of  the  quasi-static  magnetic  field  along  the  surface  of  the 
Earth  must  vanish. 

In  this  work,  the  complete  set  of  boundary  conditions  for  the  electric  and 
magnetic  fields  at  infinity,  between  the  air  and  the  Earth’s  surface,  and  between 
the  air  and  the  perfectly  conducting  plasma  close  to  the  explosion  site  are 
derived.  The  field  equations,  source  functions  and  boundary  conditions  are 
written  in  terms  of  spheroidal  and  torsional  vector  fields.  It  is  shown  that,  in 
this  form,  a  propagator  matrix  formalism  which  automatically  guarantees  that 
all  boundary  conditions  are  satisfied  can  be  developed  to  solve  the  equations  for 
the  electric  and  magnetic  fields.  The  propagator  matrix  formalism  developed 
in  this  work  is  applied  to  the  numerical  solution  of  Maxwell’s  equations  for 
the  electric  and  magnetic  fields  for  the  case  of  a  typical  explosion.  It  is  found 
that  the  boundary  conditions  along  the  surface  of  the  Earth  impose  consistency 
conditions  which  must  be  satisfied  by  the  individual  multipoles  of  the  fields, 
as  well  as  by  the  source  current  densities  produced  by  the  original  explosion. 
Values  are  obtained  for  the  electric  and  magnetic  fields  and  compared  with 
experimental  results. 


1  Introduction 

Electric  and  magnetic  fields  of  appreciable  magnitudes,  capable  of  being  detected  for 
considerable  distances,  accompany  energetic  explosions  (Glasstone  and  Dolan  1977). 
When  the  explosions  are  caused  by  chemical  explosives,  the  fields  are  generated  by 
the  compression  of  magnetic  flux  within  the  ionised  gases  at  accelerating  shock 
fronts  (Wilhelm  1984,  1983)  and  by  the  dust  cloud  formed  by  the  explosion  (Bacon 
and Cherin 1984). For the case of nuclear explosions, the primary mechanism for
the production of these fields is electric currents caused by Compton scattering
of electrons by X- and γ-rays, with the other two mechanisms having very minor
roles or none at all. The fields caused by nuclear explosions are generally known
as electromagnetic pulses (EMP) (e.g. Longmire and Gilbert 1980, Longmire 1978).
As  a  comparison  of  the  energies  involved  in  each  case  would  suggest,  the  fields 
produced  by  nuclear  explosions  are  several  orders  of  magnitude  stronger  than  those 
caused  by  chemical  explosions. 

For  the  case  of  chemical  explosions,  the  dust  induced  electromagnetic  noise 
(DIEMN) is capable, at the least, of interfering significantly with radio and television
broadcasts. The fields produced by the compression of ionised gases can
initiate  radio-controlled  detonators  or  chemical  explosives.  In  the  case  of  nuclear 
explosions,  the  fields  generated  can  produce  field  strengths  of  several  kV/m  over 
kilometre  distance  scales  and  time  scales  of  milliseconds.  In  the  Johnston  Island 
test  of  1962,  the  fields  created  by  a  nuclear  explosion  seem  to  have  caused  current 
surges  in  electrical  equipment  of  sufficient  magnitude  to  have  triggered  fuses  in  the 
street  lighting  system  in  Honolulu  some  800  miles  distant  (Glasstone  and  Dolan 
1977). 

Lightning flashes have been observed to occur at times up to 1 second after
a nuclear explosion at distances of 0.9 to 1.4 km from the explosion site (Wyatt
1980, Uman et al 1972). These flashes are presumed to have been produced by the
dielectric breakdown of the air by the electric fields generated by EMP. The most
commonly used models for EMP are unable to predict electric fields of sufficient
magnitude (usually believed to be ~ 100 kV/m) to cause this breakdown (Wyatt
1980, Uman et al 1972).

Extensive  work  has  been  done  in  the  past  few  years  on  the  theoretical  calculation 
of  EMP  effects  at  various  stages  of  the  explosion.  Particular  interest  has  been  paid 
to  the  EMP  generated  by  an  explosion  close  to  the  surface  of  the  Earth,  especially 
during  the  so-called  quasi-static  phase  during  which  the  rate  of  change  with  respect 
to  time  of  the  electric  and  magnetic  fields  is  sufficiently  slow  that  it  may  be  neglected 
in Maxwell’s equations. It is well known that an electric field must vanish within
a perfect conductor. In the region over which the Earth can be considered to be
a perfect conductor, the quasi-static EMP field at the surface of the Earth should
be zero. This boundary condition is automatically satisfied by odd multipoles of
the electric field. From this condition, it has generally been assumed that the
quasi-static electric field produced by a near surface blast can consist only of odd
multipoles of the field throughout all space (e.g. Downey 1983, Grover 1980).

In this paper, the complete set of boundary conditions for the quasi-static electric
and magnetic fields at infinity, between the air and the Earth’s surface,
and between the air and the perfectly conducting plasma close to an explosion site
are derived. The field equations, boundary conditions, and source functions are
expressed in terms of spheroidal and torsional vector fields. A general algorithm
which uses propagator matrices and which automatically guarantees that all boundary
conditions are satisfied is presented, and used to obtain numerical solutions to
Maxwell’s equations for the electric and magnetic fields produced by a typical explosion.
These results are then compared with experimental results. In particular,
it is found that the boundary conditions along the surface of the Earth impose consistency
conditions on all of the multipoles of the electric and magnetic fields, but
that these conditions do not preclude the existence of even multipole fields.


2  Maxwell’s Equations for the Quasi-static Phase of EMP


The time-dependent Maxwell’s equations are

$\nabla \cdot \vec{D} = \rho_e$,  (1)

$\nabla \cdot \vec{B} = 0$,  (2)

$\dfrac{\partial \vec{B}}{\partial t} = -\nabla \times \vec{E}$,  (3)

$\dfrac{\partial \vec{D}}{\partial t} = -\vec{J} + \nabla \times \vec{H}$,  (4)

where $\vec{B}$ is the magnetic induction in webers/m², $\vec{E}$ the electric field in volts/m, $\vec{J}$
the current density in amps/m², $\vec{D} = \epsilon\vec{E}$ the electric displacement in coulombs/m²,
$\vec{H} = \vec{B}/\mu$ the magnetic field strength in amps/m, $\rho_e$ the space charge density in
coulombs/m³, $\epsilon$ the dielectric permittivity in farads/m, and $\mu$ the magnetic permeability
in henrys/m. Throughout the course of this paper, we shall be concerned
only with the calculation of the fields in air, and hence $\epsilon$ and $\mu$ will be assumed to
take their free space values, $\epsilon_0$ and $\mu_0$.

Assuming  that  the  fields  are  evaluated  at  times  late  enough  that  the  fields  are 
nearly  constant  in  time,  eqs.  (3)  -  (4)  become 


$\nabla \times \vec{E} = 0$,  (5)

$\nabla \times \vec{B} = \mu_0 \vec{J}$,  (6)

in air.

Now, the current density $\vec{J}$ can be divided into two parts, the source current
$\vec{J}_s$ and the conduction current $\vec{J}_c$. The source current arises from the ionisation
created by the explosion; its exact form depends on whatever the dominant ionisation
mechanism is at the time the fields are evaluated. For chemical explosions,
this can be the ionisation created by the shock or collisions with dust particles. In
nuclear explosions, $\vec{J}_s$ is created primarily by Compton scattering of the electrons
in the air by γ- and X-rays. Since, to a good approximation, Ohm’s law is obeyed
in air, one can write

$\vec{J} = \vec{J}_s + \sigma\vec{E}$,  (7)

where $\vec{J}_c = \sigma\vec{E}$ and the conductivity $\sigma$ is measured in 1/(ohm-m). In air, $\sigma$ depends
upon the value of $\vec{E}$ (e.g. Lee 1980, Longmire and Gilbert 1980). However, up to
fields of strength ~ 100 kV/m, this dependence is small and can be neglected.

According to the Helmholtz-Lamb decomposition theorem any vector field can
be represented as the sum of a spheroidal vector field, $\vec{S}$, and a torsional vector
field, $\vec{T}$, where

$\vec{S} = \sum_{m=-\infty}^{\infty} \sum_{n=0}^{\infty} \vec{S}_n^m$,  (8)

$\vec{T} = \sum_{m=-\infty}^{\infty} \sum_{n=0}^{\infty} \vec{T}_n^m$.

In eqs. (8) - (11), $U_n^m$, $V_n^m$, $W_n^m$ are the functions containing the radial dependence
of the vector associated with each surface spherical harmonic, $S_n^m$. The surface
spherical harmonics of angular order m and rank n are defined by

$S_n^m(\theta, \phi) = P_n^m(\cos\theta)\, e^{im\phi}$,  (12)

where the associated Legendre functions, $P_n^m$, are given by

$P_n^m(\cos\theta) = (-1)^m \sin^m\theta \, \dfrac{d^m P_n(\cos\theta)}{d(\cos\theta)^m}$,  (13)

and the Legendre polynomials, $P_n$, by

$P_n(\cos\theta) = \dfrac{(-1)^n}{2^n n!} \dfrac{d^n(\sin^{2n}\theta)}{d(\cos\theta)^n}$.  (14)

Here r, $\theta$, and $\phi$ are the standard spherical polar co-ordinates with the origin located at the
site of the original explosion, as shown in Fig. 1.
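The Legendre polynomials defined above are easy to evaluate numerically, and doing so shows why odd-rank terms are singled out at the Earth's surface: on the plane $\theta = 90°$ the argument $\cos\theta$ is zero, and $P_n(0)$ vanishes for every odd n. The Python sketch below uses the standard three-term recurrence rather than the derivative formula; the function name `legendre` is illustrative.

```python
def legendre(n, x):
    """P_n(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x            # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


# Values on the plane theta = 90 degrees, i.e. cos(theta) = 0.
for n in range(6):
    print(n, legendre(n, 0.0))
```

The even-rank values (1, -0.5, 0.375 for n = 0, 2, 4) are non-zero, so a field containing even multipoles needs some other mechanism to meet the boundary condition on that plane.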

From  the  orthogonality  conditions  on  spheroidal  and  torsional  fields  (Bullard 
and  Gellman  1954,  Smylie  1965)  it  is  known  that 
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S  Pn,  sin  OdO  d<t>  =  0. 


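In order to satisfy eq. (5), the electric field must be entirely spheroidal, thus:

$\vec{E} = \sum_{m=-\infty}^{\infty} \sum_{n=0}^{\infty} \vec{E}_n^m$,  (18)

where

$\vec{E}_n^m = -\dfrac{\partial E_n^m(r)}{\partial r} S_n^m(\theta,\phi)\,\hat{r} - \dfrac{E_n^m(r)}{r} \dfrac{\partial S_n^m(\theta,\phi)}{\partial\theta}\,\hat{\theta} - \dfrac{E_n^m(r)}{r\sin\theta} \dfrac{\partial S_n^m(\theta,\phi)}{\partial\phi}\,\hat{\phi}$.  (19)

Eqs. (18) - (19) are, of course, equivalent to stating that the electric field $\vec{E}$ must
be derivable from a scalar potential.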

After  substituting  eq.  (7)  into  eq.  (6)  and  taking  the  divergence,  one  obtains 

$-\nabla \cdot (\sigma\vec{E}) = \nabla \cdot \vec{J}_s$.  (20)
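The step leading to eq. (20) can be written out explicitly. Since the divergence of a curl vanishes identically, taking the divergence of eq. (6) with the current density of eq. (7) gives

```latex
\nabla\cdot(\nabla\times\vec{B}) = 0
\;\Longrightarrow\;
\mu_0\,\nabla\cdot\vec{J} = \mu_0\,\nabla\cdot\bigl(\vec{J}_s + \sigma\vec{E}\bigr) = 0
\;\Longrightarrow\;
-\nabla\cdot(\sigma\vec{E}) = \nabla\cdot\vec{J}_s .
```

In words, in the quasi-static phase the conduction current must carry away exactly the charge delivered by the source current.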

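The substitution of eqs. (18) - (19) into eq. (20), along with the application of
eq. (15), yields

$\sigma \dfrac{d^2 E_n^m}{dr^2} + \left(\dfrac{2\sigma}{r} + \dfrac{d\sigma}{dr}\right) \dfrac{dE_n^m}{dr} - n(n+1) \dfrac{\sigma}{r^2} E_n^m = \dfrac{dU_{J,n}^m}{dr} + \dfrac{2}{r} U_{J,n}^m - \dfrac{n(n+1)}{r} V_{J,n}^m$,  (21)

where it is assumed that the source current density, $\vec{J}_s$, is a spheroidal vector of
the form

$\vec{J}_s = \sum_{m=-\infty}^{\infty} \sum_{n=0}^{\infty} \vec{J}_{s,n}^m$,  (22)

where

$\vec{J}_{s,n}^m = U_{J,n}^m(r)\, S_n^m(\theta,\phi)\,\hat{r} + V_{J,n}^m(r) \left[\dfrac{\partial S_n^m(\theta,\phi)}{\partial\theta}\,\hat{\theta} + \dfrac{1}{\sin\theta} \dfrac{\partial S_n^m(\theta,\phi)}{\partial\phi}\,\hat{\phi}\right]$,  (23)

and the conductivity, $\sigma$, is assumed to be a function of r only. In fact, the conductivity
exhibits a weak dependence on things like local field strength, angle, and
water vapour content of the air. The assumption that the conductivity $\sigma$ is a function
only of r seems to be adequate at late times, at least as a first approximation
(Grover 1980).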

Using eqs. (2), (6) - (17), and the assumption that $\vec{J}_s$ is entirely spheroidal, one
finds that

$\vec{B} = \sum_{m=-\infty}^{\infty} \sum_{n=0}^{\infty} \left(\vec{A}_n^m + \vec{B}_n^m\right)$,  (24)


where 


*'  +  — - gi—‘ 9  +  —94,  *'  (25) 


i 


B"  "  “7£7 — di  s  +  B" (r)  at  *■  (26) 

The equations for the radial dependence of the field, $A_n^m$ and $B_n^m$, are given by

$\dfrac{d^2 A_n^m}{dr^2} + \dfrac{2}{r} \dfrac{dA_n^m}{dr} - \dfrac{n(n+1)}{r^2} A_n^m = 0$,  (27)

dEn  |  ^nm _  &!M)  ^■^'n  .  ^A*0  pm  MO  m  ,  _  ,  ,\i r  mi  tno\ 

~ir  +  rB"  -  M^TT)~ir + ~rE ■  - (£7-'-" +  n(n + 1)v>-» > • (28) 

Grover (1980) and others (e.g. Hodgdon 1984) have derived similar equations,
with the important difference that n in eq. (21) was only allowed to assume odd
values. This was done in order to satisfy the boundary condition that the radial
component, $E_r$, of the electric field must vanish identically over the surface of the
Earth. However, as can be seen, if $U_{J,n}^m$ and $V_{J,n}^m$ are not identically zero for all even n,
this ignores those multipoles excited by those current densities with even values of
n. Since, in fact, the even multipoles of $\vec{J}_s$ are not all zero, another way of satisfying
the boundary conditions must exist.

The boundary conditions on the fields across the boundary separating two regions,
1 and 2, are given by:

$\hat{n} \times (\vec{E}_2 - \vec{E}_1) = 0$,  (29)

$\hat{n} \cdot (\vec{D}_2 - \vec{D}_1) = \omega$,  (30)

$\hat{n} \cdot (\vec{B}_2 - \vec{B}_1) = 0$,  (31)

$\hat{n} \times (\vec{H}_2 - \vec{H}_1) = \vec{K}$,  (32)

(Jackson 1962, Stratton 1941). In eqs. (29) - (32), the variables with subscript 1
refer to the region 1, and those with subscript 2 refer to the region 2. $\omega$ is the
surface charge density on the boundary between the regions, $\vec{K}$ the surface current
density, and $\hat{n}$ the unit normal going from region 1 to region 2.

If the Earth is assumed to be a perfect conductor, the electric fields must vanish
within it. Hodgdon (1984) has pointed out that sufficiently close to an explosion,
the conductivity of the air first approaches and then surpasses that of the Earth.
This implies that there is a second region, distinct from the Earth, over which the
boundary conditions, eqs. (29) - (32), must be applied: the region around the blast
site in which the air is so highly ionised that it can be considered a perfect conductor.
For simplicity, it will be assumed that this region is a hemisphere centred at r = 0
and with a radius $r_0$. Within that hemisphere, the electric and magnetic fields
must vanish as well as within the Earth. At this stage, it will be assumed that the
Earth can be treated as an infinite plane, located at $\theta$ = 90°, and which is perfectly
conducting for all $r > r_0$. It will also be assumed that all physical processes involved
in the explosion and the field are symmetrical in the x-y plane and hence that the
resulting fields are independent of the $\phi$ co-ordinate. This implies that the angular
order m of the surface spherical harmonics in eqs. (19) - (21) is always 0, and hence
that one is left only with a summation over the rank n.

Accordingly,  the  boundary  conditions  at  the  edge  of  the  perfectly  conducting 
hemisphere  around  the  blast  site  become: 

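$$\hat{n} \times \mathbf{E}_2 \big|_{r=r_0} = 0, \qquad (33)$$

$$\hat{n} \cdot \mathbf{D}_2 \big|_{r=r_0} = \varpi_2, \qquad (34)$$

$$\hat{n} \cdot \mathbf{B}_2 \big|_{r=r_0} = 0, \qquad (35)$$

$$\hat{n} \times \mathbf{H}_2 \big|_{r=r_0} = \mathbf{K}_2 \big|_{r=r_0}. \qquad (36)$$

In eqs. (33) - (36), $\varpi_2$ is the surface charge density in the air, $\mathbf{K}_2$ the surface current
density in the air, and all other variables with a subscript 2 are to be understood
to take those values which they would have in the air. The outward normal $\hat{n}$ to
the hemisphere is the unit vector $\hat{r}$.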

The  application  of  eqs.  (15)  -  (17)  to  eqs.  (33)  -  (36)  yields: 


<L  .-0- 


dE°  _  (2 n  +  1)  r* 


f  Wj(ro,0).P°(cos0)  sinflcM, 

Jo 


®°»u = -$£t)  r*^»)  («> 

for $n \neq 0$. Equation (27) for $A_n^0$ is simply the radial part of Laplace’s equation in
spherical co-ordinates, whose solution is given by

$$A_n^0 = a_n r^n + b_n r^{-(n+1)}, \qquad (41)$$

where $a_n$ and $b_n$ are constants to be determined by the boundary conditions.
Equation (41), taken in conjunction with eq. (3S) and the requirement that the magnetic
field be 0 as $r \to \infty$, implies that $a_n = b_n = 0$. This in turn implies that there is
no spheroidal magnetic field during the quasi-static phase of EMP, only a torsional
one.

The boundary conditions eqs. (33) - (36) degenerate even further for the case
of the spherically symmetric or monopole part of the field (i.e. for n = 0). There
can be no magnetic field associated with this part of the field, and so eq. (36) must
be satisfied identically by insisting that the spherically symmetric part of $\mathbf{K}_2$ be
0. Equation (33) is automatically satisfied, leaving eq. (34) as the sole remaining
condition.
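The corresponding boundary conditions along the surface of a perfectly
conducting Earth are:

$$\hat{n} \times \mathbf{E}_2 \big|_{\theta=90^\circ} = 0, \qquad (42)$$

$$\hat{n} \cdot \mathbf{D}_2 \big|_{\theta=90^\circ} = \varpi_1, \qquad (43)$$

$$\hat{n} \cdot \mathbf{B}_2 \big|_{\theta=90^\circ} = 0, \qquad (44)$$

$$\hat{n} \times \mathbf{H}_2 \big|_{\theta=90^\circ} = \mathbf{K}_1. \qquad (45)$$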

In eqs. (42) - (45), $\varpi_1$ is the surface charge density along the surface of the
Earth, and $\mathbf{K}_1$ the surface current density along the Earth. Again, the variables with
a subscript 2 are to be evaluated in the air. Incidentally, for explosions over sea
water, the surface of the Earth can be considered to be a perfect conductor much
closer to the explosion site than would be the case for an explosion over soil.

The  outward  normal  to  the  surface  of  the  Earth  is  the  unit  vector  along  the  z 
axis.  Using  this,  and  substituting  eqs.  (18),  (19),  (24),  (25)  and  (26)  into  eqs.  (42)  - 
(45),  one  obtains 


S  (  9‘3TJ”  +  S~~df )  lsX, 

“  /  AdE°  .  E°dPn\ 


£  sin«S^(r)^5lf  +  cosSBjfrJ^S  =  Mo  ((i?i  •  ?)f  +  IK,  ■  0)0)  .  (48) 

n=0  a0  a°  goo 


Equation (44) is automatically satisfied for m = 0. The summations over the
odd Legendre polynomials $P_n$ vanish, leaving only the summations over the even
polynomials to be satisfied, thusly:


(2n)!  dE°,„ 


22n(ft!)2  dr 


izi-ir1 

f»=0 


(2 ft  4-  2)  (2ft  +  1)!  E%n+1  _  gi 
2(2n+2)  ((n  +  l)!)2  r  e0  ’ 


y '(  I\n+1  (2n  +  2)(2n  +  1)!  Q  .  .  _  - 

S  )  2(2n+2)  ((ft  +  l)!)2  2n+1^  1 
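The factor $(-1)^n (2n)!/(2^{2n}(n!)^2)$ that multiplies the even terms in these sums appears to be the value of $P_{2n}$ at $\cos\theta = 0$, i.e. on the surface of the Earth, where the odd polynomials vanish. The short Python sketch below is an illustration only and is not part of the original calculation; the function names are hypothetical. It checks both statements with the standard three-term recurrence for the Legendre polynomials.

```python
from math import factorial

def legendre_at_zero(n_max):
    """Return [P_0(0), ..., P_n_max(0)] from the three-term recurrence
    (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x) evaluated at x = 0."""
    values = [1.0, 0.0]  # P_0(0) = 1, P_1(0) = 0
    for k in range(1, n_max):
        values.append(-k * values[k - 1] / (k + 1))
    return values[: n_max + 1]

def even_closed_form(n):
    """Closed form (-1)^n (2n)! / (2^(2n) (n!)^2), assumed here to equal P_{2n}(0)."""
    return (-1) ** n * factorial(2 * n) / (2 ** (2 * n) * factorial(n) ** 2)

p = legendre_at_zero(10)
for n in range(6):
    # Columns: degree 2n, recurrence value, closed-form value
    print(2 * n, p[2 * n], even_closed_form(n))
```

The two columns agree for every even degree, and every odd-degree entry of `p` is zero, which is why only the even harmonics survive in the conditions at $\theta = 90^\circ$.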


The usual practice (Hodgdon 1984, Grover 1980) has been to satisfy the boundary
condition by insisting that $dE_n^0/dr = E_n^0 = 0$ throughout all space for the even spherical
harmonics. As was indicated above, this seems unlikely if the current density
depends to some degree upon the even spherical harmonics. What seems more likely
is that the non-zero field at the surface of the Earth draws charges there which
arrange themselves in such a fashion as to cancel the inducing field at the
surface, but not necessarily throughout all space. Essentially, eqs. (49) - (51) impose
consistency conditions on the source current $\mathbf{J}_s$.

To  sum  up:  in  this  section  we  have  derived  the  equations  governing  the  electric 
fields  induced  by  electric  currents  in  the  atmosphere  from  explosions  of  various 
types.  We  have  shown  how  the  fields  may  be  decomposed  into  multipole  fields 
and  that  where  the  source  and  conduction  currents  are  dependent  upon  particular 
multipoles,  electric  fields  which  are  dependent  on  those  multipoles  are  created.  From 
this  it  follows  that  in  general,  both  even  and  odd  multipole  fields  exist  as  a  result 
of  an  explosion. 

Where the conductivity of the Earth is sufficiently high that it may be
considered a perfect conductor with respect to the air, the boundary condition on the field
requires that the component of the field along the ground must vanish. For the odd
multipoles of the field, this condition is satisfied automatically. For the even
multipoles, it is satisfied by the appearance of a surface charge density which produces
a field which counteracts the original field at the surface of the Earth. However,
the field which results from the sum of these two fields need not be zero everywhere
else, and hence the even multipole fields can contribute to the total field.

3  Numerical  Methods 

Before one attempts numerical solutions of the field equations, eqs. (21) and (28),
it is necessary to know the conductivity $\sigma$ and the source currents $\mathbf{J}_s$. Both of
these depend upon the precise nature of the ionisation process. Since the most
interesting cases from a theoretical standpoint occur when the fields are produced
by a nuclear explosion, it was decided to choose expressions for $\sigma$ and $\mathbf{J}_s$ appropriate
to a thermonuclear explosion. Hence, at this point the further development of the
field equations will be confined to the specific case of the fields generated by a
thermonuclear explosion.

The total atmospheric conductivity is composed of two parts: an ionic and an
electronic conductivity. Each Compton recoil electron produces about thirty
thousand ion-electron pairs. At early times, the electronic conductivity dominates;
at late times, the ionic dominates. The expression for the total conductivity is hence

$$\sigma = \sigma_e + \sigma_i \qquad (52)$$

where $\sigma_e$, the electronic conductivity, is given by

$$\sigma_e = e \mu_e \frac{S}{\alpha_e} \qquad (53)$$

and $\sigma_i$, the ionic conductivity, is given by


(Downey 1983, Wyatt 1980, Grover 1980). In eqs. (53) and (54) e is the charge on
the electron, $\mu_e$ the electron mobility, S the local ionisation rate, $\alpha_e$ the electron
attachment rate, $\mu_i$ the ionic mobility, and $\gamma_i$ the ion-ion recombination rate. The
ionisation rate S is assumed to have the form

$$S = S_0 \frac{\exp(-r/\lambda)}{r^2} \qquad (55)$$

where $\lambda$ is the effective mean free path of the gamma rays, $S_0$ is a constant for
a given time and yield, and r is, as above, the radial co-ordinate of a spherical
co-ordinate system centred at the blast site.
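As a rough numerical illustration of eqs. (53) and (55), the Python sketch below evaluates the electronic conductivity at a few radii. It is a sketch under stated assumptions rather than the code used in this work: the ionisation rate is taken to fall off as $\exp(-r/\lambda)/r^2$, the parameter values are the ones quoted in Section 4 for a 10 megaton burst (with the electron mobility read as .25 m²/V-sec), the ionic term of eq. (54) is left out, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602e-19   # electron charge, C
MU_E = 0.25            # electron mobility, m^2/(V s); assumed reading of the quoted value
ALPHA_E = 1.5e8        # electron attachment rate, 1/s
S0 = 1.1e30            # ionisation constant, ion-pairs/(m s), 10 megaton burst
LAMBDA = 320.0         # gamma-ray attenuation length, m

def ionisation_rate(r):
    """Assumed form of eq. (55): S = S0 exp(-r/lambda) / r^2, r in metres."""
    return S0 * math.exp(-r / LAMBDA) / r**2

def electronic_conductivity(r):
    """Eq. (53): sigma_e = e mu_e S / alpha_e, in (ohm m)^-1."""
    return E_CHARGE * MU_E * ionisation_rate(r) / ALPHA_E

for r in (400.0, 1000.0, 2000.0):
    print(f"r = {r:6.0f} m   sigma_e = {electronic_conductivity(r):.3e} (ohm m)^-1")
```

With these inputs the electronic part alone is roughly 5 x 10^-4 (ohm m)^-1 at r = 400 m and falls steeply with distance.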

For  convenience,  we  shall  define 


F0(t)  =  -3.9  x  10“”  Y0Na  exp (—8.33  x  102t), 

G0{t)  =  8.2  x  10"”y07Va  exp(— 8.33  x  102t), 

H0{t)  =  -2.8  x  10“2Sy0ATa  exp(— 16.7t), 

F{r,t)  =  [exp(-2.65  x  10"5p0r)  -exp(-1.04  x  10“Vor)]  , 


G(r,t)  =  — exp(— 4.61  x  10“5po»'), 


H{r,t)  =  [exp(-2.20  X  10“5p0r)  -  exp(-4.78  x  HT5p0r)]  , 

X(r,t) = F(r,t) + H(r,t),

Y(r,t) = 16F(r,t) + 1.3H(r,t),

Q(r,t) = G(r,t),

Z(r,t) = -G(r,t).


In terms of the functions defined in eq. (56), the source current densities are
given by

$$J_r = F(r,t)(1 + 16\cos\theta), \qquad J_\theta = G(r,t)(1 - \cos\theta), \qquad (57)$$

for ground capture sources, and

$$J_r = H(r,t)(1 + 1.3\cos\theta), \qquad J_\theta = 0, \qquad (58)$$

for air capture sources (Downey 1983). In eqs. (56) - (58), $Y_0$ is the total yield in
kilotons, $N_a$ is the number of neutrons/kiloton, $\rho_0$ is the air density in mg/cm³,
r is the radial distance from the blast in centimetres, $\theta$ is the polar angle, t the
retarded time in seconds, $J_r$ the radial current density in abamps/cm², and $J_\theta$ the
polar current density in abamps/cm². The total current density at any retarded
time t must be the vector sum of eqs. (57) - (58). Hence, the components of the
source current density are

$$J_r = X(r,t) + Y(r,t)\cos\theta, \qquad J_\theta = Q(r,t) + Z(r,t)\cos\theta. \qquad (59)$$
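The bookkeeping behind the combined source current can be checked directly. The following Python sketch is illustrative only: the amplitudes F, G and H are arbitrary placeholder numbers rather than the physical profiles of eq. (56), and the function names are hypothetical. It confirms that adding the ground capture and air capture currents component by component gives the X, Y, Q, Z form.

```python
import math

def ground_capture(F, G, theta):
    """(J_r, J_theta) for ground capture sources."""
    return F * (1 + 16 * math.cos(theta)), G * (1 - math.cos(theta))

def air_capture(H, theta):
    """(J_r, J_theta) for air capture sources."""
    return H * (1 + 1.3 * math.cos(theta)), 0.0

def total_current(F, G, H, theta):
    """Combined form with X = F + H, Y = 16 F + 1.3 H, Q = G, Z = -G."""
    X, Y, Q, Z = F + H, 16 * F + 1.3 * H, G, -G
    return X + Y * math.cos(theta), Q + Z * math.cos(theta)

F, G, H = 2.0, -0.5, 0.7   # placeholder amplitudes, not physical values
max_mismatch = 0.0
for theta in (0.0, math.pi / 4, math.pi / 2):
    jr_g, jt_g = ground_capture(F, G, theta)
    jr_a, jt_a = air_capture(H, theta)
    jr, jt = total_current(F, G, H, theta)
    max_mismatch = max(max_mismatch, abs(jr - (jr_g + jr_a)), abs(jt - (jt_g + jt_a)))
print(max_mismatch)
```

The mismatch stays at rounding level for every angle, so only the n = 0 and n = 1 angular dependences enter the radial source current.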

Using  eqs.  (56)  -  (59)  and  eqs.  (15)  -  (17)  one  finds  that: 


Uj.°0  =  X(r), 
Vj.°0  =  0, 
Uj.\  =  Y(r ), 


Vj.\  =  ~ ~Q{r ), 


E7J.«  =  0,n>l, 


y  o  _  _  (2n  +  l)?r  _ (2n  2/;)!(n  2fc) _ 

J,n  2 n(n  +  1)  U  J  22'‘-2tJfc!(n  -  'c)!(n  -  2k  +  2)  ((n/2  -  A:)!)2 


for  n  even  ,  n  >  2  , 

x  iW  (n-1)/2 


°  _  _(2n_j-l)jr  .  . 1  yi/  /_■,%* _ (2n  2fc)! _ 

/,n  2n(n  +  l)V^j  ~  1  ’  22"-2*-1fc!(n-fc)!(((n-l)/2-Jk)!)2(n-2Jfc  +  l) 


for  n  odd  ,  n  >  3  . 


By  substituting  eq.  (61)  back  into  eqs.  (21)  and  (28),  it  is  now  possible  to  solve 
numerically  for  the  radial  part  of  the  electric  potential  and  the  electric  and  magnetic 
fields.  It  should,  however,  be  noted  that  eq.  (61)  must  be  converted  into  amps/m2  in 
order  to  be  consistent  with  the  expression  for  the  conductivity.  It  is  generally  most 
convenient  in  numerical  solutions  of  differential  equations  to  use  scaling  factors  to 
form  dimensionless  equations.  By  defining 


dE°  ML 
dr  ~VlQT 2’ 


K  =  y* 


QT 2  ’ 


Mo  V%TL' 


r  =  r'L, 

t  =  rr, 

.tq> 


ML3' 


Uj.°n  =  (U}.°n) 


v,°  =  (V*0>\ 

r  Jtn  \  J,n)  y£2  ’ 


F°  =  (f*°)  - 
n  \  "/  tl3' 


where 


Q  =  L*T  (K(ro))  , 


M  =  LST 


[W'(i)L- 


pO  *V>.n  .  2,r  0  0 

K  -  +  -Vj.„ - ; — v>- 

and L, T, M, and Q are the scaling factors in MKS units for length, time, mass
and electric charge, respectively, with $r_0$ being the smallest value of r that appears
in the integration. One is allowed to specify any two of L, T, M and Q as free
parameters. It was found to be most convenient to specify T = 16.7 secs, and L
as twice the maximum value of r used in the integration. Using the dimensionless
quantities defined above, the field equations, eqs. (21) and (28), become


dyi  -l  (2c*  dc*\  .  y-t 

-j—  =  —  - +  -7-  yi  +  n(n  +  1)-^ 

dr*  <t*  V  r*  dr*  J  v  ’  r*2 


(n) 


dy2 

d^  =  yi’ 


dys  _  a*  ,  2ys  1  f TTm  0  t  ,  ,XT,,0\ 

dP  ~  J^TTT)91  +  Pn  ~  ~  ~  ^TTT)  (u'~  +  n(n  +  1)v>~)  • 

Equations (64) form a system of linear differential equations which can most
easily be solved by means of the propagator matrix formalism (Gilbert and Backus
1966, Smylie and Mansinha 1971). In a system of n linear homogeneous differential
equations

$$\frac{d\mathbf{f}}{dr} = A(r) \cdot \mathbf{f}(r), \qquad (65)$$

where A(r) is the matrix of coefficients, an n x n matrix F(r) is called a fundamental
matrix of the system eq. (65) if it satisfies the condition

$$\frac{dF}{dr} = A(r) \cdot F(r), \qquad (66)$$

and has an inverse for every r in its domain. Now, $F(r) = P(r, r_s)$ is called the
propagator matrix of eq. (65) if

$$F(r_s) = P(r_s, r_s) = I \qquad (67)$$

where I is the identity matrix.

It follows (Gilbert and Backus 1966), among other things, that

$$P(r_{i+1}, r_{i-1}) = P(r_{i+1}, r_i) \cdot P(r_i, r_{i-1}),$$

$$P(r_i, r_{i-1}) = P^{-1}(r_{i-1}, r_i), \qquad (68)$$

$$\mathbf{f}(r) = P(r, r_s) \cdot \mathbf{f}(r_s),$$

where $\mathbf{f}(r)$ is the solution to eq. (65) and $\mathbf{f}(r_s)$ is the solution at $r = r_s$.
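The propagator properties listed above can be illustrated on a small test problem. The Python sketch below is a toy example and not the code used for eqs. (64): the 2 x 2 coefficient matrix `toy_A` is a stand-in chosen because its propagator is a rotation matrix that is known in closed form, the integrator is a fixed-step classical Runge-Kutta scheme without the automatic error control described in the text, and all function names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def axpy(p, k, s):
    """Return p + s * k element-wise."""
    return [[p[i][j] + s * k[i][j] for j in range(len(p[0]))]
            for i in range(len(p))]

def propagator(A, r_from, r_to, steps=400):
    """Integrate dP/dr = A(r) P from P(r_from) = I with fixed-step RK4."""
    n = len(A(r_from))
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = (r_to - r_from) / steps
    for step in range(steps):
        r = r_from + step * h
        k1 = mat_mul(A(r), P)
        k2 = mat_mul(A(r + h / 2), axpy(P, k1, h / 2))
        k3 = mat_mul(A(r + h / 2), axpy(P, k2, h / 2))
        k4 = mat_mul(A(r + h), axpy(P, k3, h))
        for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            P = axpy(P, k, w * h / 6)
    return P

def toy_A(r):
    """Stand-in coefficient matrix; its propagator is a rotation by r - r_s."""
    return [[0.0, 1.0], [-1.0, 0.0]]

def max_diff(a, b):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a[0])))

P_01 = propagator(toy_A, 0.0, 1.0)
P_12 = propagator(toy_A, 1.0, 2.0)
P_02 = propagator(toy_A, 0.0, 2.0)
P_20 = propagator(toy_A, 2.0, 0.0)

exact_02 = [[math.cos(2.0), math.sin(2.0)], [-math.sin(2.0), math.cos(2.0)]]
identity = [[1.0, 0.0], [0.0, 1.0]]

print(max_diff(P_02, exact_02))                # agreement with the closed form
print(max_diff(mat_mul(P_12, P_01), P_02))     # composition property
print(max_diff(mat_mul(P_20, P_02), identity)) # inverse property
```

All three differences are at the level of the integration error, which shows how a solution can be carried across a sequence of radial intervals by multiplying the propagators of the individual intervals.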
Now, the system of non-homogeneous equations,

$$\frac{d\mathbf{f}}{dr} = A(r) \cdot \mathbf{f}(r) + \mathbf{g}(r), \qquad (69)$$

can be shown to have the solution

$$\mathbf{f}(r) = \int_{r_s}^{r} P(r,\xi) \cdot \mathbf{g}(\xi)\, d\xi + P(r,r_s) \cdot \mathbf{f}(r_s) \qquad (70)$$

where $\mathbf{f}(r_s)$ is the value at $r = r_s$ of a solution to the non-homogeneous system, eq. (69). The boundary
conditions eqs. (37) - (40) and (49) - (51) would overdetermine the system of
equations (64) if $\varpi_2$, $\varpi_1$, $\mathbf{K}_1$, and $\mathbf{K}_2$ were known. Since they are not, the remaining
boundary conditions are sufficient, and the unknown functions can be calculated
from the general solution to eq. (64). In terms of the variables defined in eq. (62),
the relevant boundary conditions are:
$$y_2 \big|_{r=r_0} = 0, \qquad (71)$$

$$\sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{2^{2n}(n!)^2}\, y_1^{(2n)} = 0, \qquad (72)$$

where $y_1^{(2n)}$ is the solution to the first of eqs. (64) associated with the 2nth harmonic.

The boundary condition eq. (71) is satisfied if the initial solution, $\mathbf{y}(r_0)$, is chosen
thusly:

$$\mathbf{y}(r_0) = \begin{pmatrix} \kappa_1 \\ 0 \\ \kappa_2 \end{pmatrix} \qquad (73)$$

It was decided to set $r_0$ = .4 km. At that point, the conductivity $\sigma$ of the
air is approximately equal to that which is typical for the Earth’s surface, ~ $10^{-3}$
(ohm-metre)$^{-1}$. In order to be consistent with the boundary conditions applied at
the surface of the Earth, the air conductivity $\sigma$ should be regarded as infinite there,
and hence the boundary conditions, eqs. (37) - (40), apply. The constants $\kappa_1$ and
$\kappa_2$ are determined by the condition that $dE_n^0/dr \to 0$ and $B_n^0 \to 0$ as $r \to \infty$. Since the
fields were found to be small for r > 5.0 km (Hodgdon 1984, Downey 1983), it was
decided to apply the boundary condition $y_1 = y_3 = 0$ at values of r greater than
that. For example, for n = 1, the boundary conditions were applied at r = 8.1 km,
for n = 2 at r = 7.0 km, and so on. As a check, the integration was reperformed
ab initio and extended out roughly twice as far, at which point new values for $\kappa_1$
and $\kappa_2$ were calculated. Typically, the difference between the two values for the fields
was found to be < 1% for r < 4.0 km, rising to about 3% at r = 6.0 km. It should
be  noted  that  the  use  of  the  boundary  condition  at  infinity  at  a  finite  value  of  r 
implies  that  yi  and  hence  the  tangential  field  need  not  be  zero  there,  as  in  fact  it 
is  not  in  general.  However,  its  magnitude  is  sufficiently  small  at  the  point  at  which 
the  boundary  condition  is  applied  that  it  may  be  neglected. 


It should be noted that the equations developed above must be modified slightly
when n = 0. In that case, $B_0^0 = 0$, and eq. (28) becomes

$$\frac{dE_0^0}{dr} = \frac{U_{J_s,0}^0}{\sigma}. \qquad (74)$$

As one would have expected, eq. (74) can also be obtained from the integration of
eq. (21) with n = 0.

Equations (64) were solved using the propagator matrix formalism of eqs. (65) -
(70) and a four-point Runge-Kutta algorithm with automatic error controls to
obtain the propagator matrix. The expression $\int P(r,\xi) \cdot \mathbf{g}(\xi)\, d\xi$ was evaluated
using Simpson’s second rule with various base point spacings and an algorithm
in which global errors were controlled by halving the base point spacing until the
largest difference between successive iterations was less than .2 kV/m in the electric
fields for n = 1 and less than .02 kV/m for all other multipoles. The Runge-Kutta
routines were checked for convergence by decreasing the upper error bound from
$10^{-9}$ to $10^{-10}$.
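The halving strategy described above can be sketched in a few lines. The Python example below is an illustration under stated assumptions and not the routine used in this work: it applies the composite form of Simpson's second (three-eighths) rule to a scalar test integrand, whereas the actual integrand is the matrix-vector product appearing in eq. (70), and the function names and the tolerance are hypothetical.

```python
import math

def simpson_three_eighths(f, a, b, panels):
    """Composite Simpson 3/8 rule on 3 * panels equal sub-intervals."""
    n = 3 * panels
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points shared by two panels get weight 2, the others weight 3
        total += (2.0 if i % 3 == 0 else 3.0) * f(a + i * h)
    return 3.0 * h / 8.0 * total

def integrate_with_halving(f, a, b, tol, max_halvings=20):
    """Halve the base point spacing until successive estimates differ by < tol."""
    panels = 1
    previous = simpson_three_eighths(f, a, b, panels)
    for _ in range(max_halvings):
        panels *= 2
        current = simpson_three_eighths(f, a, b, panels)
        if abs(current - previous) < tol:
            return current, panels
        previous = current
    raise RuntimeError("no convergence within max_halvings")

# Test integrand with a known answer: the integral of sin over [0, pi] is 2
value, panels = integrate_with_halving(math.sin, 0.0, math.pi, 1e-8)
print(value, panels)
```

Doubling the number of panels keeps the count of sub-intervals a multiple of three, so each refinement reuses the same rule with half the spacing.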

4  Numerical  Results  and  Analysis 

Figures  2-5  show  the  radial  and  tangential  electric  fields,  electric  potential  and 
torsional  magnetic  field  associated  with  the  dipole  (i.e.  for  n  =  1  in  eqs.  (21),  (27) 
and  (28))  as  a  function  of  the  radial  co-ordinate  r  at  various  angles  for  a  nuclear 
explosion  of  10  megatons  evaluated  at  a  retarded  time  of  1  msec  after  the  blast. 
Unless  otherwise  stated,  it  will  henceforth  be  assumed  that  all  of  the  fields  discussed 
in  this  section  are  evaluated  at  the  same  retarded  time  of  1  msec,  and  that  the 
source  currents  are  those  generated  by  a  10  megaton  thermonuclear  explosion  (i.e. 
$Y_0 = 10^4$ in eqs. (56)). One also needs to have values for $S_0$, $\rho_0$, $N_a$, $\alpha_e$, $\mu_e$, $\gamma_i$,
and $\mu_i$. Following Grover (1980), $S_0$ in eq. (55) was set to $1.1 \times 10^{30}$ ion-pairs/m-sec,
a value appropriate to a 10 megaton burst. The values assumed for the other
quantities were also those chosen by Grover (1980):

$N_a = 2.0 \times 10^{23}$ neutron/kT,

$\rho_0 = 1.225$ mg/cm³,

$\alpha_e = 1.5 \times 10^{8}$ sec$^{-1}$,

$\mu_e = .25$ m²/V-sec,

$\gamma_i = 2.0 \times 10^{-12}$ m³/sec,

$\mu_i = 2.5 \times 10^{-4}$ m²/V-sec.

In reality, these values depend upon things like the field strength, air density and
fraction of water vapour present. However, the average values will suffice as a first
approximation. The gamma dose attenuation length $\lambda$ was set to 320 metres for all
calculations. Unless otherwise stated, $r_0$ will be assumed to be .4 km throughout.


Figure 6 shows the monopole electric field (i.e. for n = 0 in eq. (21)). Figures
7-10 show the quadrupole fields and potential (i.e. for n = 2 in eqs. (21), (27) and
(28)). Figures 11-14 show the fields and potential for the sextupole fields (i.e. for
n = 3 in eqs. (21), (27) and (28)). The corresponding graphs for the higher order
multipoles show essentially the same behaviour as those for n = 3 and so are not
reproduced here.

These graphs demonstrate several features of interest about the quasi-static
electric and magnetic fields. Firstly, it is evident that the magnetic fields produced
by the source currents are relatively weak. The strongest magnetic field is the one
associated with the dipole (n = 1 multipole), which has a maximum of only ~ 20%
of the typical strength of the geomagnetic field. This justifies a posteriori the neglect
of self-consistent effects in the calculation of the fields, since such effects could only
be a small perturbation to the main fields.

As one would have expected from the expressions for the source current densities,
eqs. (61), the n = 1 fields dominate the others. However, both the n = 0 field and
the higher order multipoles have non-negligible field strengths, especially close to
the origin. For example, the n = 0 electric field has a peak value of ~ 10% of the
n = 1 field, and the n = 2 field a peak value of ~ 4% of the n = 1 field.

Figures 15-16 show the sums of the radial and tangential electric fields for the
multipoles from n = 0 to n = 5. As expected, the dipole (n = 1) field is the
dominant influence, for both the radial and tangential fields, except for the total
radial field at $\theta = 90^\circ$. There, since all of the multipoles with odd values of n are
identically zero, the multipoles for even values of n are the only non-zero fields.

That, of course, presents a problem since, from eq. (49), the sum of the radial
fields must be zero along the surface of the Earth, i.e. for $\theta = 90^\circ$. Figure 17 shows
the degree to which this consistency condition is violated. The solid line is the sum
of the radial fields at $\theta = 90^\circ$ for n = 0, 2, 4, with the n = 0 field being calculated
using eq. (74). The dotted line shows the value which the n = 0 field would have to
have in order to cancel the n = 2 and n = 4 fields at $\theta = 90^\circ$. Since multipoles of
order higher than 4 add relatively little to the total fields, their contributions have
been neglected.

From  the  preceding  discussion,  it  is  clear  that  the  expressions  for  the  source 
current  densities,  eqs.  (56)  -  (60),  or  for  the  conductivity,  eqs.  (52)  -  (55),  or  both, 
are  inconsistent  as  they  stand  and  must  be  altered.  The  approach  taken  in  Heaton 
(1987)  was  to  assume  that  the  expressions  for  the  source  current  densities  and 
conductivities  were  correct,  but  that  the  boundary  conditions  were  satisfied  at  the 
surface  of  the  Earth  by  an  induced  electric  potential.  While  possible  in  principle, 
this  method  leads  to  considerable  numerical  difficulties.  An  alternative  approach 
would  be  to  consider  the  physics  of  the  situation  more  closely.  Equations  (52)  - 
(60)  have  been  derived  from  fits  to  experimental  data  (Downey  1983) ;  accordingly, 
it is reasonable to seek the solution which changes them the least. An inspection
of eqs. (59) - (60) reveals that X(r,t) and Z(r,t) have the least influence on the
final field strengths, and of those two, X(r) affects only the value of the n = 0
field. Accordingly, it seems most reasonable to estimate the true field strength by
replacing the value of X(r,t) given in eq. (56) by $\sigma A(r)$ where A(r) is the function
graphed by the dotted line in Fig. 17.

Table 1: Comparisons of Calculated EMP Electric Fields

Radius   Total Field    Total Field    Total Field    Total Field    Total Field
         (Wyatt 1980)   (Grover 1980)  (Downey 1983)  (Heaton 1987)  (Present Work)
(m)      (kV/m)         (kV/m)         (kV/m)         (kV/m)         (kV/m)

 500     390            45             23             304            100
 900     164            21             19             128             58
1300     114            15             13              63             25

Fields with an angular dependence are evaluated at $\theta = 0$.

Figures  18-19  show  the  total  fields  obtained  by  replacing  the  value  of  the  n  =  0 
field  as  given  by  eq.  (74)  by  A(r)  and  summing  from  n  =  0  to  n  =  5,  as  before.  A 
comparison  of  Figs.  15  and  18  shows  that  the  effect  of  ensuring  that  the  boundary 
conditions  along  the  surface  of  the  Earth  are  satisfied  in  this  fashion  is  to  decrease 
the  peak  value  of  the  field  by  ~  9%. 

One  test  of  any  model  of  EMP  is  whether  it  is  capable  of  producing  fields  of 
sufficient  intensity,  usually  regarded  as  being  in  excess  of  100  kV/m,  to  cause  the 
lightning  observed  during  several  tests.  As  can  be  seen  from  Figs.  2-4,  the  field  for 
the  n  =  1  multipole  reaches  a  maximum  of  ~  118  kV/m  at  .4  km  and  falls  to  less 
than  1  kV/m  at  5.1  km.  It  is  well  known  (e.g.  Hodgdon  1984,  Longmire  and  Gilbert 
1980) that the dominant field is dipolar because of the $\cos\theta$ dependence of the
current  density.  Hence,  the  fields  displayed  in  Figs.  2-4  should  constitute  the  greater 
part  of  the  total  electric  field.  It  is  encouraging  that  the  magnitudes  calculated  are 
around  those  needed  to  produce  nuclear  lightning  over  some  of  the  range  in  which 
they  were  observed  (900  -  1400  m  from  the  blast)  at  time  scales  of  1  msec  (Wyatt 
1980).  Wyatt’s  values  for  the  fields  are  listed  in  Table  1,  and  compared  with  the 
ones  obtained  here,  as  well  as  with  Heaton’s  (1987),  Downey’s  (1983)  and  Grover’s 
(1980)  values  for  the  total  fields.  These  values  are  necessarily  adequate  only  for 
order  of  magnitude  comparisons,  because  of  the  angular  dependence  of  some  of  the 
field  values.  As  can  be  seen,  the  results  from  the  current  work  are  considerably  larger 
than  either  Grover’s  or  Downey’s  results,  and  considerably  smaller  than  Wyatt’s  or 
Heaton’s  (1987)  values,  reaching  the  100  kV/m  level  only  at  500  m.  In  Downey’s 
and  Grover’s  cases,  the  results  are  likely  attributable  to  the  different  conductivity 
models  used.  Downey  used  detailed  fits  to  the  expected  form  of  the  conductivity, 
taking  into  account  the  air  chemistry,  as  opposed  to  Grover’s  more  approximate 
model.  Even  so,  Downey  only  found  a  variation  of  10%  -  30%  between  his  values  and 
Grover’s. The major reason for the variation in the calculations of the magnitudes
of  the  electric  fields  seems  to  be  the  boundary  conditions  at  r0.  Grover  and  Downey 
felt  that  the  condition  that  the  field  should  vanish  at  r  =  0  required  that  both  the 
radial  and  tangential  electric  fields  be  zero  at  r0,  the  point  at  which  the  integration 
is  begun.  While  this  satisfies  the  boundary  conditions  there,  the  condition  is  rather 
more  restrictive  than  required.  Wyatt  neglected  the  boundary  conditions  at  r0  as  did 
Heaton  (1987).  These  last  two  sets  of  results,  then,  assume  that  the  region  of  very 
high  conductivity  was  sufficiently  far  from  the  region  of  interest  that  its  effects  on  the 
field  would  be  insignificant.  The  results  in  this  paper  are  intermediate  between  the 
two  groups  of  results,  predicting  fields  lower  than  those  of  Wyatt  and  Heaton  (1987) 
but  higher  than  those  of  Grover  and  Downey.  The  boundary  conditions  near  the 
explosion  site  chosen  in  this  paper  are  more  restrictive  than  Wyatt’s  and  Heaton’s 
(1987)  but  less  restrictive  than  Grover’s  and  Downey’s.  Wyatt  and  Heaton  (1987) 
essentially  applied  no  boundary  conditions  to  the  fields  near  the  explosion  while 
Grover  and  Downey  required  that  both  the  radial  and  tangential  components  of 
the  electric  field  vanish  near  the  explosion  site.  The  boundary  conditions  developed 
here  for  that  region,  eqs.  (37)  -  (40),  require  that  only  the  tangential  electric  field 
vanish  near  the  explosion  site.  All  this  implies  that  the  inner  boundary  conditions, 
eqs.  (37)  -  (40),  have  a  significant  effect  on  the  magnitudes  of  the  fields,  even  far  from 
the  perfectly  conducting  region,  contrary  to  the  assumptions  of  Wyatt  and  Heaton 
(1987).  This  is  borne  out  by  an  examination  of  the  effects  that  the  conductivities  of 
the  Earth  and  the  perfectly  conducting  hemisphere  around  the  explosion  site  have 
on  the  magnitude  and  location  of  the  maximum  electric  field  strength.  When,  as  a 
test,  r0  was  successively  set  to  several  different  values,  the  maximum  value  of  the 
n  =  1  electric  field  was  reduced  in  all  cases.  For  r0  =  .5  km  the  maximum  field 
strength  was  116  kV/m  and  for  r0  =  .6  km,  it  was  100  kV/m.  For  both  cases, 
the  maximum  occurred  at  r0.  When  the  value  of  r0  was  decreased,  the  maximum 
electric  field  strength  decreased  more  noticeably,  and  was  not  always  located  at  r0. 
For  example,  for  r0  =  .01  km,  the  maximum  field  strength  was  78  kV/m  at  .41  km, 
with the field strength at r0 being only 28 kV/m. With r0 = .1 km, the maximum
field  strength  was  80  kV/m  at  .4  km,  and  at  r0,  the  field  strength  was  57  kV/m.  It 
should  be  noted  that  the  expressions  for  the  conductivity,  eqs.  (52)  -  (55),  are  not 
really  valid  for  r  <  .4  km,  and  so  the  results  within  that  region  should  be  regarded 
merely  as  being  suggestive  rather  than  definitive. 

These  results  can  best  be  understood  by  a  consideration  of  the  conductivity  and 
source  current  models  employed.  Increasing  the  value  of  r0  is  essentially  equivalent 
to  decreasing  the  ground  conductivity.  The  maximum  value  of  the  fields  decreased 
slightly  when  this  was  done  because  that  part  of  the  source  current  density  at  values 
of  r  <  r0  was  not  included  in  the  calculation,  thus  reducing  the  total  field  strength. 
Decreasing the value of r0 is equivalent to increasing the conductivity of the ground.
The  vanishing  of  the  tangential  electric  field  at  r0  forces  a  proportionately  smaller 
discontinuity  in  the  potential  as  r0  decreases,  hence  producing  a  weaker  radial  field 
there.  One  would  expect  the  air  around  the  explosion  site  to  be  divided  into  roughly 
three regions: a completely ionised one, a partially ionised one, and an almost
completely unionised one. The boundaries between these regions will obviously change
at different rates with respect to time. The unionised and partially ionised regions
will move inwards, and the completely ionised region will eventually vanish
altogether. The results obtained above suggest that the widths of these three zones
and the structure of the transitions from one to the other are crucial for the
determination of the magnitudes of the fields. Since the maximum values of the fields
are  near  to  those  required  to  initiate  dielectric  breakdown  in  the  air,  it  may  be 
that  relatively  small  changes  in  atmospheric  or  ground  properties  might  produce 
conditions  favourable  for  the  occurrence  of  nuclear  lightning  in  one  instance,  and 
unfavourable  in  the  next. 

5  Conclusions 

In  this  work,  it  has  been  demonstrated  that  the  quasi-static  electric  fields  produced 
by  an  explosion  contain  components  that  depend  on  both  the  odd  and  even  surface 
spherical  harmonics,  and  that  this  remains  true  even  if  the  explosion  occurs  near  a 
good  conductor. 

Expressions  for  the  excitation  function  for  the  EMP  in  terms  of  the  surface 
spherical  harmonics  were  obtained,  and  used,  along  with  a  simple  model  of  ionic 
and  electronic  conductivity,  to  obtain  values  for  the  electric  and  magnetic  fields 
generated  by  a  typical  explosion.  A  propagator  matrix  algorithm  for  solving  the 
EMP  equations  was  developed.  Using  this  algorithm,  it  was  demonstrated  that  the 
dominant  electric  and  magnetic  fields  are  dipoles,  but  that  the  contribution  of  the 
other  multipole  fields  to  the  total  field  is  significant.  In  particular,  the  calculated 
values  of  the  field  were  near  those  which  are  expected  to  produce  the  lightning 
which  has  been  observed  to  accompany  nuclear  explosions.  The  results  obtained 
here  suggest  that  the  detailed  structure  of  the  ionised  regions  around  the  explosion 
site  is  crucial  to  the  existence  of  nuclear  lightning,  and  that  relatively  small  changes 
in  a  few  parameters  might  be  sufficient  to  permit  its  formation. 

It  is  shown  that  the  values  which  were  used  for  the  source  currents  lead  to  values 
of  the  electric  fields  which  do  not  satisfy  the  boundary  conditions  over  the  surface 
of  the  Earth,  and  hence  should  be  modified  to  take  into  account  the  boundary 
conditions  on  the  fields. 

Efforts  are  currently  being  made  to  extend  this  work  by  incorporating  more 
accurate,  self-consistent  models  for  the  conductivity  and  source  currents,  perhaps 
by  the  introduction  of  Monte  Carlo  techniques  into  the  algorithms.  As  well,  it  is 
planned  to  modify  the  boundary  conditions  to  take  account  of  the  large  but  finite 
conductivities  in  the  Earth  and  around  the  blast  site,  and  extend  the  propagator 
matrix formalism to the time-dependent case.


SPLINE-BASED FINITE-ELEMENT METHOD FOR SOLVING
A STEFAN'S PROBLEM IN A FINITE DOMAIN - FORMULATION


Shunsuke Takagi

U.S.  Army  Cold  Regions  Research  and  Engineering  Laboratory 

Hanover,  NH  03755 


ABSTRACT. The finite-element method presented in this paper has two
features. First, a cubic spline is included in the basis functions; the
advantage of the inclusion has not yet been examined. Second, only the space
coordinates are used to determine the temperature distribution. The
time-coordinate increment of the phase front corresponding to the
space-coordinate increment can then be determined. In our problem, where the
final position of the phase front may be determined at the start of the
solution, the solution method using a space-coordinate sequence is preferred
to the one using a time-coordinate sequence. This numerical method can work
smoothly even for an extremely large time.

Analytical  formulation  only  is  presented  in  this  paper. 


I. INTRODUCTION. Instead of the two end conditions usually employed
for determining the two extraneous unknowns, Section II introduces two
internal conditions and develops a cubic spline without end conditions.
Section III then demonstrates that a cubic spline interpolating the unknown
temperatures produces a set of roof-shaped basis functions. Equidistant mesh
points are used for the derivation.

We have applied this finite-element method to the freezing of water in
a finite domain. The interface is always chosen as a mesh point. The ice
and water domains are divided into equilength subregions, with n and m
internal points, respectively, where n and m may be any integers larger
than or equal to 2. Therefore, the mesh points are never fixed in this
method.

Section IV states the problem to be solved. The temperature distributions
are determined in Section V by using the interfacial coordinate $\xi$
in place of time t. On the assumption that the temperature distributions at
the time substitute $\xi$ are known, Section VI finds the simultaneous equations
for the unknown temperatures at the time substitute $\xi + d\xi$, showing that they
are quadratic. Application of Newton's approximation reduces solving a set
of simultaneous quadratic equations to sequentially solving sets of
simultaneous linear equations representing tangent planes of the quadratics.
The time-coordinate increment dt corresponding to the space-coordinate
increment $d\xi$ can be found by use of the expression of $d\xi/dt$.

In our problem, where the terminal temperatures are both given, the
final interfacial position can be found at the start of the solution. We can
choose, therefore, an appropriate magnitude of increment $d\xi$ at any stage of
the solution. The solution method using a space-coordinate sequence is more
convenient in our problem than the customary method using a time-coordinate
sequence. It was experienced in the numerical computation of the analytical
solution (Ref. 1) that the former works smoothly, even for an extremely large
time.

Analysis only is presented in this paper. The advantage of using a
spline function in a finite-element method has yet to be clarified.


II. CUBIC SPLINE WITHOUT END CONDITIONS. Instead of the two end
conditions that are usually stipulated, we adopt two internal conditions,

$$ y'''\big|_{x_1 - 0} = y'''\big|_{x_1 + 0}, \qquad
   y'''\big|_{x_{N-1} - 0} = y'''\big|_{x_{N-1} + 0}, $$

for determining a cubic spline that passes through equally spaced points
$P_0(x_0, y_0), \ldots, P_N(x_N, y_N)$. Selecting $y_i''$, denoted by $z_i$, at point $P_i$
as unknowns ($i = 0, \ldots, N$), the two internal conditions become

$$ z_0 - 2z_1 + z_2 = 0, \qquad z_{N-2} - 2z_{N-1} + z_N = 0. $$

In other words, we adopt a single cubic passing through points $P_0$, $P_1$, $P_2$
and another through points $P_{N-2}$, $P_{N-1}$, $P_N$. The minimum of N in this
stipulation, therefore, is 4, if the two cubics should not be the same. If
the two cubics can coincide, N may be 3.


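Thus determined, the equations for $z_i$ are:

$$
\begin{aligned}
z_0 - 2z_1 + z_2 &= 0 && [0] \\
z_0 + 4z_1 + z_2 &= A_1 && [1] \\
z_1 + 4z_2 + z_3 &= A_2 && [2] \\
&\;\;\vdots \\
z_{N-3} + 4z_{N-2} + z_{N-1} &= A_{N-2} && [N-2] \\
z_{N-2} + 4z_{N-1} + z_N &= A_{N-1} && [N-1] \\
z_{N-2} - 2z_{N-1} + z_N &= 0 && [N]
\end{aligned}
$$

where

$$ A_i = \frac{6}{(\Delta x)^2}\,(y_{i-1} - 2y_i + y_{i+1}) \tag{1} $$

for $i = 1, \ldots, N-1$. The equal distance between two adjacent mesh points is
denoted by $\Delta x$.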
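As an illustration only, the system [0] to [N] can be solved directly for a small mesh. The sketch below is not the author's program: it assumes the standard right-hand side A_i = 6(y_{i-1} - 2y_i + y_{i+1})/(Δx)^2, and the names `solve` and `spline_second_derivatives` are hypothetical. Exact rational arithmetic is used so that the check on cubic data is free of rounding.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick any row at or below the diagonal with a nonzero entry.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def spline_second_derivatives(y, dx):
    """Solve equations [0]..[N] for z_i = y''(x_i) on an equidistant mesh."""
    N = len(y) - 1
    if N < 3:
        raise ValueError("the two internal conditions need N >= 3")
    rows = [[1, -2, 1] + [0] * (N - 2)]           # equation [0]
    rhs = [0]
    for i in range(1, N):                          # equations [1]..[N-1]
        row = [0] * (N + 1)
        row[i - 1], row[i], row[i + 1] = 1, 4, 1
        rows.append(row)
        # Assumed definition of A_i (standard cubic-spline right-hand side).
        rhs.append(Fraction(6) * (y[i - 1] - 2 * y[i] + y[i + 1]) / (dx * dx))
    rows.append([0] * (N - 2) + [1, -2, 1])       # equation [N]
    rhs.append(0)
    return solve(rows, rhs)


# Data sampled from y = x^3: the spline should return y'' = 6x exactly.
z = spline_second_derivatives([x**3 for x in range(6)], 1)
```

Because the two internal conditions force a single cubic across the first two and the last two subintervals, data taken from one cubic polynomial are reproduced exactly, which gives a convenient check.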


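The general solution of the above equations can be found for $N \ge 6$.
Define

$$ f_1 = 1/6, \qquad f_2 = 1, \qquad f_3 = 4, $$

and

$$ f_i = 4f_{i-1} - f_{i-2} \tag{2} $$

recursively for $i \ge 4$. The difference equation (2) is solved to:

$$ f_i = \frac{1}{2\sqrt{3}}\left\{(2+\sqrt{3})^{\,i-1} - (2-\sqrt{3})^{\,i-1}\right\}
       = \sum_{p=0}^{[(i-2)/2]} \binom{i-1}{2p+1}\, 3^{p}\, 2^{\,i-2-2p}, \tag{3} $$

which happens to be valid for $i \ge 2$.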
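The agreement between the recursion (2) and the binomial sum in (3) can be checked numerically. The sketch below assumes the starting values f_1 = 1/6, f_2 = 1 and f_3 = 4, which are the values consistent with the transformed relations [1]' and [2]'; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def f_recursive(i_max):
    """f_1..f_{i_max} from the assumed starting values and recursion (2)."""
    f = {1: Fraction(1, 6), 2: Fraction(1), 3: Fraction(4)}
    for i in range(4, i_max + 1):
        f[i] = 4 * f[i - 1] - f[i - 2]
    return f


def f_closed(i):
    """Binomial-sum form of (3); only meaningful for i >= 2."""
    return sum(comb(i - 1, 2 * p + 1) * 3**p * 2**(i - 2 - 2 * p)
               for p in range((i - 2) // 2 + 1))


f = f_recursive(12)
agree = all(f[i] == f_closed(i) for i in range(2, 13))
```

The closed form fails at i = 1 (it gives 0, not 1/6), which is why the recursion is started separately and (3) is stated only for i ≥ 2.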

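When $N = 2r$ with $r \ge 3$, we may, as proved below, transform the
simultaneous equations to:

$$
\begin{aligned}
z_0 &= 2z_1 - z_2 && [0]' \\
z_1 &= f_1 A_1 && [1]' \\
f_3 z_2 + f_2 z_3 &= \sum_{i=1}^{2} (-1)^{2-i} f_i A_i && [2]' \\
&\;\;\vdots \\
f_r z_{r-1} + f_{r-1} z_r &= \sum_{i=1}^{r-1} (-1)^{r-1-i} f_i A_i && [r-1]' \\
z_{r-1} + 4z_r + z_{r+1} &= A_r && [r]' \\
f_{r-1} z_r + f_r z_{r+1} &= \sum_{i=1}^{r-1} (-1)^{r-1-i} f_i A_{N-i} && [r+1]' \\
&\;\;\vdots \\
f_2 z_{N-3} + f_3 z_{N-2} &= \sum_{i=1}^{2} (-1)^{2-i} f_i A_{N-i} && [N-2]' \\
z_{N-1} &= f_1 A_{N-1} && [N-1]' \\
z_N &= 2z_{N-1} - z_{N-2} && [N]'
\end{aligned}
$$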


The relation [0]' is found from [0]. The relation [1]' is found by
subtracting [0] from [1]. Substituting the relation [1]' into [2], we find
[2]'. Assume that the relation

$$ f_{k+1} z_k + f_k z_{k+1} = \sum_{i=1}^{k} (-1)^{k-i} f_i A_i $$

is valid for $k \ge 2$. Substituting $z_k$ by

$$ z_k = -4z_{k+1} - z_{k+2} + A_{k+1}, $$

we find

$$ (4f_{k+1} - f_k)\, z_{k+1} + f_{k+1} z_{k+2} = \sum_{i=1}^{k+1} (-1)^{k+1-i} f_i A_i . $$

Therefore if (2) is true, [r-1]' must also be true. We apply the same
procedure from below starting at [N-1]', and find [r+1]'.


We solve the three equations in the middle for $z_{r-1}$, $z_r$, and $z_{r+1}$.
Then the rest of the unknowns can be successively determined. We express the
solution in terms of an (N+1) by (N+1) matrix $a_i^k$ defined by

$$ (\Delta x)^2 z_i = \sum_{k=0}^{N} a_i^k\, y_k , \qquad i \in \{0, \ldots, N\} . \tag{4} $$

Here a set-theoretic notation (Ref. 2,3) is used to show that i is a member
of the set of numbers enclosed in a pair of braces. Similar notations will
be used henceforth. The solution is shown in Appendix A.
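For small N the matrix of (4) can also be obtained by brute force, which is useful for checking hand-derived entries. The sketch below is an assumption-laden illustration, not the Appendix A procedure: it inverts the coefficient matrix of [0] to [N] by Gauss-Jordan elimination, with the right-hand sides written as (Δx)^2 A_i = 6y_{i-1} - 12y_i + 6y_{i+1}. The name `second_derivative_matrix` is hypothetical.

```python
from fractions import Fraction


def second_derivative_matrix(N):
    """Return a[i][k] such that (dx)^2 z_i = sum_k a[i][k] * y_k, N >= 3."""
    n = N + 1
    C = [[0] * n for _ in range(n)]   # coefficients of z in [0]..[N]
    R = [[0] * n for _ in range(n)]   # (dx)^2 * A_i as linear forms in y
    C[0][0:3] = [1, -2, 1]
    C[N][N - 2:N + 1] = [1, -2, 1]
    for i in range(1, N):
        C[i][i - 1:i + 2] = [1, 4, 1]
        R[i][i - 1:i + 2] = [6, -12, 6]
    # Gauss-Jordan on [C | R]; the right block becomes C^{-1} R.
    aug = [[Fraction(v) for v in c + r] for c, r in zip(C, R)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


a3 = second_derivative_matrix(3)
a7 = second_derivative_matrix(7)
```

Two properties make convenient checks: the entries are centrosymmetric, a[i][k] = a[N-i][N-k], and data sampled from y = x^2 with unit spacing must give z_i = 2 in every row.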


When $N = 2r + 1$ with $r \ge 3$, we transform the equations [0]...[N] to

$$
\begin{aligned}
z_0 &= 2z_1 - z_2 && [0]'' \\
z_1 &= f_1 A_1 && [1]'' \\
f_3 z_2 + f_2 z_3 &= \sum_{i=1}^{2} (-1)^{2-i} f_i A_i && [2]'' \\
&\;\;\vdots \\
f_{r+1} z_r + f_r z_{r+1} &= \sum_{i=1}^{r} (-1)^{r-i} f_i A_i && [r]'' \\
f_r z_r + f_{r+1} z_{r+1} &= \sum_{i=1}^{r} (-1)^{r-i} f_i A_{N-i} && [r+1]'' \\
&\;\;\vdots \\
f_2 z_{N-3} + f_3 z_{N-2} &= \sum_{i=1}^{2} (-1)^{2-i} f_i A_{N-i} && [N-2]'' \\
z_{N-1} &= f_1 A_{N-1} && [N-1]'' \\
z_N &= 2z_{N-1} - z_{N-2} && [N]''
\end{aligned}
$$

Following the same procedure as the preceding, we have in this case the
two equations in the middle, which we solve for $z_r$ and $z_{r+1}$. Then the
rest of the unknowns can be successively determined.

The result is shown in Appendix A in terms of the matrix $a_i^k$. The cases
N = 3, 4 and 5 are also listed in Appendix A. The reason for the restriction
$r \ge 3$ is exhibited in the Appendix.

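III. BASIS FUNCTIONS. We denote the cubic spline in an interval
$[x_i, x_{i+1}]$ by $y(x \in [x_i, x_{i+1}])$. Given $y_i$, $y_{i+1}$, $z_i$, $z_{i+1}$, it is formulated
(Ref. 4),

$$
y(x \in [x_i, x_{i+1}]) =
\begin{cases}
y_i \dfrac{x_{i+1} - x}{\Delta x} + y_{i+1} \dfrac{x - x_i}{\Delta x}
 - \dfrac{(\Delta x)^2}{6}\, z_i \left\{ \dfrac{x_{i+1} - x}{\Delta x} - \left( \dfrac{x_{i+1} - x}{\Delta x} \right)^{3} \right\} \\[6pt]
\qquad - \dfrac{(\Delta x)^2}{6}\, z_{i+1} \left\{ \dfrac{x - x_i}{\Delta x} - \left( \dfrac{x - x_i}{\Delta x} \right)^{3} \right\},
 & \text{for } x \in [x_i, x_{i+1}], \\[6pt]
0, & \text{for } x \notin [x_i, x_{i+1}].
\end{cases}
\tag{5}
$$

A pair of brackets enclosing two points, like those at right above, means a
closed interval spanned by the points.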

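The cubic spline in the whole domain $[x_0, x_N]$ is given by

$$ y(x \in [x_0, x_N]) = \sum_{i=0}^{N-1} y(x \in [x_i, x_{i+1}]) . \tag{6} $$

Define

$$
p^j(x \in [x_i, x_{i+1}]) =
\begin{cases}
-\dfrac{1}{6} \left[ \dfrac{x_{i+1} - x}{\Delta x} - \left( \dfrac{x_{i+1} - x}{\Delta x} \right)^{3} \right] a_i^j
-\dfrac{1}{6} \left[ \dfrac{x - x_i}{\Delta x} - \left( \dfrac{x - x_i}{\Delta x} \right)^{3} \right] a_{i+1}^j ,
 & \text{for } x \in [x_i, x_{i+1}], \\[6pt]
0, & \text{for } x \notin [x_i, x_{i+1}],
\end{cases}
$$

where $j \in \{0, \ldots, N\}$ and $i \in \{0, \ldots, N-1\}$. Then (6) may become

$$
y(x \in [x_0, x_N]) = \sum_{i=0}^{N-1} \left\{ y_i \frac{x_{i+1} - x}{\Delta x} + y_{i+1} \frac{x - x_i}{\Delta x}
 + \sum_{j=0}^{N} y_j\, p^j(x) \right\}_{x \in [x_i, x_{i+1}]} .
$$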


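The first derivatives of $p^j(x)$ are in general discontinuous at the mesh
points. We rewrite the above to

$$ y(x \in [x_0, x_N]) = \sum_{i=0}^{N} y_i\, B_i(x) $$

by defining the basis functions

$$
B_i(x) = \left( \frac{x - x_{i-1}}{\Delta x} \right)_{x \in (x_{i-1}, x_i)}
       + \left( \frac{x_{i+1} - x}{\Delta x} \right)_{x \in (x_i, x_{i+1})}
       + \sum_{k=0}^{N-1} p^i(x \in [x_k, x_{k+1}])
$$

with a convention that

$$
\left( \frac{x - x_{-1}}{\Delta x} \right)_{x \in (x_{-1}, x_0)} = 0
\quad \text{and} \quad
\left( \frac{x_{N+1} - x}{\Delta x} \right)_{x \in (x_N, x_{N+1})} = 0
\tag{10}
$$

and another that the value at $x = x_i$ should be found as the limit
$x \to x_i - 0$ or $x \to x_i + 0$, where a pair of parentheses enclosing two points
means an open interval spanned by them. The basis function satisfies

$$ B_i(x_j) = \delta_i^j , \tag{11} $$

where $\delta_i^j$ is Kronecker's $\delta$.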

IV. PROBLEM. We consider freezing of water in a finite domain
$0 \le X \le \ell$. The boundary temperatures $T_A$ at $X = 0$ and $T_B$ at $X = \ell$ are
constant, the latter being also the initial temperature. They satisfy the
condition $T_A < T_F < T_B$, where $T_F$ is the phase change temperature.

At t = 0, ice emerges at X = 0, whose temperature we express by
$T_I(X, \kappa_I t)$, where $\kappa_I$ is the thermal diffusivity of the ice. The phase front
is denoted by s(t). The temperature of the water is expressed by $T_W(X, \kappa_W t)$.

We introduce three nondimensional coordinates and one nondimensional
constant,

$$ x = X/\ell, \qquad \tau = \kappa_I t/\ell^2, \qquad \xi = s(t)/\ell, \qquad \delta = \kappa_W/\kappa_I . \tag{12} $$

Then the heat equations may become:

$$ \frac{\partial T_I}{\partial \xi} \frac{d\xi}{d\tau} = \frac{\partial^2 T_I}{\partial x^2} , \qquad
   \frac{\partial T_W}{\partial \xi} \frac{d\xi}{d\tau} - \delta\, \frac{\partial^2 T_W}{\partial x^2} = 0 . $$

They are subject to the conditions:

$$
\begin{aligned}
T_I(0) &= T_A , && (14.1) \\
T_W(1) &= T_B , && (14.2) \\
T_I(\xi) &= T_F , && (14.3) \\
T_W(\xi) &= T_F , && (14.4) \\
T_W(x, 0) &= T_B , && (14.5) \\
\frac{d\xi}{d\tau} &= (C_I/L) \left( \frac{\partial T_I}{\partial x} \right)_{x=\xi}
 - \delta\, (C_W/L) \left( \frac{\partial T_W}{\partial x} \right)_{x=\xi} , && (14.6)
\end{aligned}
$$

where $C_I$ and $C_W$ are heat contents of ice and water, respectively, and L
is the latent heat.

V. TEMPERATURES. Choosing the phase front as a mesh point, we insert
equally spaced n and m internal mesh points in the ice and water domains,
creating N = n + 1 and M = m + 1 subintervals, respectively. Substituting $\xi$
for the time coordinate t, we express the unknown temperatures at the
internal mesh points by $T_i(\xi)$, where indexes i = 1,..., n are for the ice
and indexes i = n+1,..., n+m for the water. Then we have $\Delta x = \xi/N$ and
$(1-\xi)/M$ in the ice and water domains, respectively.


»v  *’ 


respectively,  and  using  the  equations  in  (16) 


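VI. FINITE-ELEMENT EQUATIONS. We compute the integrals,

$$ \int_0^{\xi} \left( \frac{\partial T_I}{\partial \xi} \frac{d\xi}{d\tau} - \frac{\partial^2 T_I}{\partial x^2} \right) B_j(x)\, dx = 0 , \qquad j \in \{1, \ldots, n\} , \tag{18.1} $$

and

$$ \int_{\xi}^{1} \left( \frac{\partial T_W}{\partial \xi} \frac{d\xi}{d\tau} - \delta\, \frac{\partial^2 T_W}{\partial x^2} \right) B_j(x)\, dx = 0 , \qquad j \in \{n+1, \ldots, n+m\} . \tag{18.2} $$

On the condition that the temperature distributions at the time substitute $\xi$
are known, we rewrite the result of the integrations to the difference
equations at the time substitute $\xi + \tfrac{1}{2}\Delta\xi$. Letting

$$ y_k = T_k(\xi + \Delta\xi) \quad \text{for } k = 1, \ldots, n+m , $$

we find that the difference equations are quadratic,

$$ \sum_{k=1}^{m+n} \sum_{h=1}^{m+n} \alpha_j^{kh}\, y_k\, y_h + \sum_{k=1}^{m+n} \beta_j^k\, y_k + \gamma_j = 0 , \tag{19} $$

where $j \in \{1, \ldots, n+m\}$. It is noted that $\alpha_j^{kh}$ is not symmetric with regard to
k and h. The process of obtaining these coefficients is shown in the next
section.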


To find the solution for $y_k$, let $y_k^{(v)}$ be the vth approximation. Then,
applying Newton's method, the (v+1)th approximation is found by solving a set
of simultaneous linear equations,

$$ \sum_{k=1}^{m+n} c_j^k\, y_k^{(v+1)} = \sum_{k=1}^{m+n} \sum_{h=1}^{m+n} \alpha_j^{kh}\, y_k^{(v)}\, y_h^{(v)} - \gamma_j , \tag{20} $$

where

$$ c_j^k = \beta_j^k + \sum_{h=1}^{m+n} \left( \alpha_j^{kh} + \alpha_j^{hk} \right) y_h^{(v)} . \tag{21} $$

The time interval $\Delta\tau$ corresponding to the time-substitute interval $\Delta\xi$ can be
found by use of (17).
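The Newton step for a quadratic system of the form (19) can be sketched in a few lines. The code below is a generic illustration under stated assumptions, not the coefficients of this paper: the arrays `alpha`, `beta` and `gamma` hold a small made-up system, and `solve_linear` and `newton_quadratic` are hypothetical helper names. Each iteration solves the tangent-plane system whose matrix is c[j][k] = beta[j][k] + sum_h (alpha[j][k][h] + alpha[j][h][k]) y_h.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (floating point)."""
    n = len(b)
    M = [list(map(float, row)) + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def newton_quadratic(alpha, beta, gamma, y0, tol=1e-12, max_iter=50):
    """Solve sum_kh alpha[j][k][h] y_k y_h + sum_k beta[j][k] y_k + gamma[j] = 0."""
    n = len(gamma)
    y = list(map(float, y0))
    for _ in range(max_iter):
        # Tangent-plane (Jacobian) coefficients; alpha need not be symmetric.
        c = [[beta[j][k] + sum((alpha[j][k][h] + alpha[j][h][k]) * y[h]
                               for h in range(n))
              for k in range(n)] for j in range(n)]
        # Right-hand side of the linearised system: quadratic part minus gamma.
        rhs = [sum(alpha[j][k][h] * y[k] * y[h]
                   for k in range(n) for h in range(n)) - gamma[j]
               for j in range(n)]
        y_new = solve_linear(c, rhs)
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("Newton iteration did not converge")


# Made-up example: y1^2 + y2^2 - 5 = 0 and y1*y2 - 2 = 0, with root (1, 2).
alpha = [[[1, 0], [0, 1]], [[0, 1], [0, 0]]]   # second block is not symmetric
beta = [[0, 0], [0, 0]]
gamma = [-5, -2]
root = newton_quadratic(alpha, beta, gamma, [0.8, 2.3])
```

In a marching scheme the natural starting guess for each step would be the known temperatures at the previous value of the time substitute, which is close to the new solution when the increment is small.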

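VII. COEFFICIENTS. Integrations of the equations in (18) become
simpler if the delta and epsilon notations introduced in the following by
(22) and (31), respectively, are employed.

By use of the delta notations, defined by

$$
\delta(k;\, i-1,\, i) =
\begin{cases}
1 , & \text{if } k = i-1 , \qquad (22.1) \\
-1 , & \text{if } k = i , \qquad\quad\;\; (22.2) \\
0 , & \text{if } k \notin \{i-1,\, i\} , \quad (22.3)
\end{cases}
$$

where $k \in \{0, \ldots, N-1\}$, we give a unified expression to the derivatives of the
basis functions,

$$ \frac{dB_i}{dx}\,(x \in (x_k, x_{k+1})) = \frac{1}{\Delta x}\, \delta(k;\, i-1,\, i) + \frac{dp^i}{dx}\,(x \in (x_k, x_{k+1})) . \tag{23} $$

Because $k \in \{0, \ldots, N-1\}$, (22.1) and (22.2) are not applicable if i = 0 and
N, respectively.

The values at the mesh points $x_k + 0$ and $x_{k+1} - 0$ are:

$$ \frac{dB_i}{dx}\,(x \in (x_k, x_{k+1})) \Big|_{x_k + 0}
 = \frac{1}{\Delta x} \left\{ \delta(k;\, i-1,\, i) - \frac{1}{6} \left( 2a_k^i + a_{k+1}^i \right) \right\} \tag{24} $$

and

$$ \frac{dB_i}{dx}\,(x \in (x_k, x_{k+1})) \Big|_{x_{k+1} - 0}
 = \frac{1}{\Delta x} \left\{ \delta(k+1;\, i,\, i+1) + \frac{1}{6} \left( a_k^i + 2a_{k+1}^i \right) \right\} . \tag{25} $$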
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Including  the  terminal  temperatures,  we  rewrite  (15.1)  to 


$$ T_I(x, \xi) = \sum_{i=0}^{N} B_i(x)\, T_i(\xi) , $$


where  we  do  not  attach  subscript  I  to  Bi(x)  for  simplicity. 


Because 


?  ■  0  , 
1-0“  *k 


where  1£{0,..,  n}  and  jg{l,..,  N-l}  ,  the  partial  integration  of  (18.1) 
yields 


N  dT?  N-l  \+l 


1  IT  I  /  Bi(x)  Bj(x) 


dt  d5  *  1 

i-0  k-0  x^ 


dx  + 


N  _  N-l  *k+l  -  </  x 

♦  l  I?<5)  l  I  d*  -  0  . 

1-0  1  k-0  ^  “  111 


To  prove  (27),  use  (11),  (A. 2),  (A. 3),  and  (A.4),  the  latter  three  of  which 
are  in  the  Appendix. 


Use  of  (23)  yields 
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where  it  is  considered  that 
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Letting  X  -  (x-x^)/Ax  ,  the  second  summand  in  (29)  is  integrated  to 


35^  1  [ C2-61  +  312)^  3-  (l-3X2)a^+1]  [<2-6X  +  2i2}a^  +  (l-3X2)aJ+l ]dX 


slta  (f  <4  4  +  4+l4+l)  *  TO  (4  4+1  +  4+1  4>t  • 


When  (29)  is  substituted  in  (28),  the  first  summand  in  (29),  i.e.  the 
product  of  the  delta  notations,  produces 


N  N-l 

F  *  7-  I  T7(C)  l  5(k;  i-1,  i)  «(k;  j-1,  j)  , 
Ax  i-0  1  k-0 


where  j£{l,...,  N-l}  .  Considering  that  the  second  multiplicand  delta 
notation  in  the  above  is  nonzero  only  for  k  *  j  -1  and  ■  j  ,  we  get 


F  -77  1  xf(C)  i-i,  1)  -  «(j;  i-i,  O]  , 

ax  i-o  1 


which  becomes 

r  •  h  I*  t}-1<5>  +  2TJ(E>  '  ^l'5’1  • 

Thus  the  second  summand  in  (28)  is  evaluated. 
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i-0  k*0  x^ 
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where 


,  N-l 

—L_  y  ri 

36.21  * 

k*0 
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Evaluated  at  i  -  j-1,  j,  j+1,  and  l^{j-l,  j,  3+1 }  ,  Q^’1  yields 


J  ,1-3-1 


i  -  35o  t(7aj-2  +  16aj-i  +  7aj}  +  (7aj’i  +  16aj_1  +  7aj^)]  + 


+  36.21  U  (ak  4  +  4+1  4+i^  +  20  ^4  4+1  +  *4+1  4^  ’  (33.1) 
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Thus  (18.1)  becomes 
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where 


Ax  -  C/N  . 


Taking  the  difference  and  mean  at  5  +  j  ^5  ,  and  using  the  temperature 
notations  defined  in  Sections  IV  and  V,  i.e.,  letting 


S  yi :  Ti 

dC  "  AC 


and  changing  T^(C)  to 


Ti<C  +  7  AC)  -  {  (7l  +  T1) 


(18.1)  becomes  finally 
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where  j£{l,...,  n}  .  and  aP  are  used  instead  of  Ai^  and  Aj^,  respec¬ 

tively,  to  avoid  the  possible  confusion  in  the  Ice  domain. 


Following  the  similar  procedure,  (18.2)  yields 
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Changing  dC/dr  to  A£/At,  and  taking  the  difference  and  mean  at 


where  j£{n+l,...,  n+m} .  A*  and  are  used  to  avoid  possible  confusion  in 
the  water  domain. 


C  +  AC  ,  we  rewrite  (17)  to 
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(36) 
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where 


-  (Cj/L)  |  (2aJ  +  a*)  for  lt<=(l . o-l}  , 


,n -  cvuf  + v,)]  . 
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Substituting  (36)  into  (34)  and  (35),  we  find  (19),  whose  coefficients 
are  shown  below.  Derived  from  (34),  the  coefficients  for  j€{l,...,  n}  are 
as  follows, 

°j  “  (jj)  9  1 » •  •  •  *  n+m} ,  h£! (l,...,  n}  . 
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n}  be  defined  by 


Let  A*  for  kfe{l . 


- k d)2i- l  \ aib  * *  »i-ki  * 7i  <4-k  • 


k 

for 

k£{  1 .  j-2} 

j-1  _  I 
j  2 

for 

k  -  j-1 

3  +  i 
j 

for 

k  -  j 

J+l  -  I 

j  2 

for 

k  -  j  +  1 

k 

j 

for 

k£{j+2,...,  n} 

-  -  55  (§)  Pk  l  Th  °i,h  for  k6{Q+1> .  n+™} 

h«l 


Let  Uj  be  defined  by 
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Derived  from  (35),  the  coefficients  for  jfifa+l,  n+m}  are  as 
follows: 
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APPENDIX  A.  Matrix  a  In  (4) 


Caaa  N  -  3: 


For  N  ^  6,  given  r  -  [N/2],  every  entry  aj  can  be  found  by  dividing  the 
coefficient  of  y^  on  the  right-hand  side  of  an  appropriate  formula  in  the 
following  by  the  factor  in  front  of  the  summation  symbol  on  the  left-hand 
side,  where  1  and  j  are  column  and  row  numbers  of  the  matrix,  respectively. 
The  order  below  must  be  strictly  followed  in  the  computation.  It  is 
demonstrated  below  that  the  least  number  of  r  is  3. 


Case  N  “  2r,  where  r  >  3 


Row  r: 


i  »  v 

i  (2f  -  f  , )  l  tr  j 
5  r  r-1  q  r  'k 

-  -  <-l)r£i  70  +  (“l)r(2f i  +  f2)7i  -  (-Dr(fi  +  2f2  +  f3)72  - 

-  6  T(-t)r'k  fk  7k  -  2CVl  +  'r»r  - 

r-l  , 

~  6  l  (-1)^  ^  7l|_k  -  C-Dr(fi  +  2f2  +  f3)7N_2  + 

+  (*l)r(2fl  +  £2)7^  -  (-l)rft  7n  , 

r-l  .  r-l  . 

where,  if  r  -  3,  suxmations  6  £  (-l)1"  f.  j.  and  6  \  (-1)*”  r.  y 

oust  be  skipped.  3  3 

Row  r-l,  r-2 . 2: 

Letting  i  »  r-l,  r-2,...,  2  successively  in  the  foilwing,  entries 
a^  in  row  1  can  be  found: 


N  N 

fi+l  ^  ai  yk  "  "  fi  ^  a  1+1  7k  "  6(-1)  fi  70  + 

+  6(-l)i(2f1  +  f2)7l  -  6<-l)i(f1  +  2f2  +  f3)y2  - 
1-1  «  < 

-  36  l  (-1  r^f.,  7j  -  6(f±_1  +  2fi)7i  + 


+6fi7i+l  * 


i-l 


where  6(-l)*(f ^  +  2f3  +  ^3^2  au8t  k®  s^PP^d  for  i  ■  2,  and  36  £  (-1)* 
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The  reac  of  Che  elenencs  can  be  found  by  using  the  cencrosymmetrlc  relation. 
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where  k,  i  -  0,...,  N 


Case  N  "  2r+l,  where  r  ^  3. 
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r-l  N-3 

where,  if  r  -  3,  summations  6  I  (-D^f^  y ^  and  6  £  (+l)^fN_fc  y^  oust  be 

skipped.  ^  1+2 

Row  r-l ,  r— 2 , . . . ,  2 : 

Letting  i  ■  r-l,  r-2,...,  2  successively  in  the  following,  elements  a.± 
in  row  i  can  be  found. 


fi+l  |  *1  yk 


-  -  I  »J+l  -  St-1)1'!  y0  +  6(-ul  <2£1  +  f2)yl  ' 
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.  i~l 

where  6(-l)1(fj  +  2f2  +  ^3^2  0,1186  **  skiPPed  for  1  "  2»  «nd  36  I  (-1)1  Jfj 

for  i  «  2  and  3  .  ^ 


Row  1: 
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*  7n  -  27,  +  7o 
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The  rest  of  the  elements  can  be  found  by  use  of  the  centrosymmetrlc 
relations (A.1). The validity of the centrosymmetric relation is obvious in
cases N = 3, 4 and 5. For the cases $N \ge 6$, it can be demonstrated by actually
writing the expressions, which is avoided in this presentation.


Stipulated by the equations [1], [2],..., [N], elements $a_i^k$ satisfy the
relations shown below. Substituting (1) and (4) into the equations, equating
the coefficients of $y_k$, k = 0,..., N, and eliminating the duplicated
relations by use of the centrosymmetric relation (A.1), three groups are
found.

Group  1:  Equations  [0]  and  [N]  yield 


V  ljp  Ir 

l0  "  2al  +  a2  “  0 


(A. 2) 


where  kfe{0,.. ..,  N}  . 

Group  2:  For  k^{i-l,  i,  i+l} ,  equations  [1],.. 
relations, 


,  [N— 1 ]  yield  three 


i-1  .  ,  i-1  .  i-1 
ai-i  +  4ai  +  ai+i 


■  6 


1  ,  >  i  .  i 

ai-i  *  S  +  ai-i 


-  -  12 


(A. 3) 


and 
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Group  3:  For  k  {o,...,  l-2}[j{i+2, . . . ,  n}  ,  equations  [1 [N-l ] 


k  .  k  k  n 
*1-1  *  4*i  +  *iti  -  0  • 


(A.4) 


if; 


Conventional set-theoretic notations (Ref. 2,3) are employed in the


ABSTRACT


The  use  of  the  superconvergence  phenomenon  for  the  retrieved  gradients 
of  piecewise  linear  approximations  on  triangular  meshes  to  the  solutions  of 
problems  of  planar  linear  elasticity  is  discussed.  In  particular  results 
for  problems  involving  singularities  are  presented,  with  particular  reference 
to  the  apparent  shortcomings  for  the  case  of  linear  elastic  fracture. 


I.  INTRODUCTION 


Finite element methods are now used routinely for problems of linear
elastic fracture, see e.g. Owen and Fawkes [5], and theoretical error estimates
for finite element approximations to stress intensity factors have been
derived, see e.g. Destuynder, Djaoua and Lescure [1]. However, there often
remains the hope with finite element methods that further research will
produce better error estimates and improved rates of convergence. To this end
we consider here the phenomenon of superconvergence in finite element methods
and its possible use in the treatment of linear elastic fracture.


The phenomenon of superconvergence in finite element methods, whereby
the rate of convergence with decreasing mesh size of the finite element
approximation to the true solution of the problem is at certain points of the
problem domain superior to that found globally, is now well known and has
been extensively researched; see e.g. the review paper of Krizek and
Neittaanmaki [4] which contains two hundred references. However, although
the superconvergence effect has been extensively exploited by engineers in
stress analysis, and even in linear elastic fracture by the use of contour
integrals using calculated stress values at Gauss points with quadrilateral
elements, it is true to say that the mathematical analysis of superconvergence
has lagged behind the practice, largely on account of the high levels of
regularity of the problem solutions required to produce meaningful
superconvergent error estimates. These have effectively precluded the analysis
of methods for realistic problems. This situation motivated the work of
Wheeler and Whiteman [8], who derived superconvergent estimates for recovered
gradients on subdomains from piecewise linear finite element approximations
to the solutions of two-dimensional Poisson problems. This work was extended
to problems of planar linear elasticity by Whiteman and Goodsell [9]. The
result is that, for problems of the above types involving boundary
singularities, superconvergence estimates are available for the approximations
to the gradients of primary variables on subdomains.

However, for linear elastic fracture the main quantity of interest is
the stress intensity factor and its approximation. In this paper we consider
the finite element approximation of the stress intensity factor for a simple
Mode I problem, through the use of the J-integral of Eshelby [2] and Rice [6].
Theoretical error estimates for methods involving recovered gradients are
presented, which have lower rates of convergence than those of [1], although
the current approximations appear numerically to have the same rate of
convergence. However, we feel it worthwhile to demonstrate these present
limitations, particularly as, for methods involving recovered gradients,
one might expect that both the theoretical and numerical rates of convergence
would be better than those obtained in the standard manner.

II.  LINEAR  ELASTICITY  AND  LINEAR  ELASTIC  FRACTURE 

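II.1 Linear Elastic Problem and Weak Formulation

The linear elastic problem is defined in the region $\Omega \subset \mathbb{R}^2$ with polygonal
boundary $\partial\Omega = \partial\Omega_c \cup \partial\Omega_T$. The displacement $\underline{u}(\underline{x}) = (u_1, u_2)^T$ at any point
$\underline{x} = (x_1, x_2) \in \Omega$ satisfies the Lamé equation

$$ -\mu \Delta \underline{u} - (\lambda + \mu)\, \mathrm{grad}\, \mathrm{div}\, \underline{u} = \underline{f} \quad \text{in } \Omega , \tag{2.1} $$

and on $\partial\Omega$ the boundary conditions

$$ \underline{u} = \underline{0} \quad \text{on } \partial\Omega_c , \tag{2.2} $$

$$ \sum_{j=1}^{2} \sigma_{ij}(\underline{u})\, n_j = g_i \quad \text{on } \partial\Omega_T , \quad 1 \le i \le 2 , \tag{2.3} $$

where $\underline{f}$ are given body forces, $\underline{g}$ are boundary tractions and $\lambda$ and $\mu$,
$\lambda, \mu > 0$, are the Lamé constants of the material.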

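The admissible displacement vectors are $\underline{v} = (v_1, v_2)^T \in (H^1(\Omega))^2$ and
we define

$$ V = \left\{ \underline{v} : \underline{v} \in (H^1(\Omega))^2 , \; v_i \big|_{\partial\Omega_c} = 0 , \; i = 1, 2 \right\} . \tag{2.4} $$

The weak form of problem (2.1)-(2.3) is

$$ \text{find } \underline{u} \in V \text{ such that } a(\underline{u}, \underline{v}) = F(\underline{v}) \quad \forall\, \underline{v} \in V , \tag{2.5} $$

in which

$$ a(\underline{u}, \underline{v}) = \int_{\Omega} \left\{ \lambda\, \mathrm{div}\, \underline{u}\; \mathrm{div}\, \underline{v} + 2\mu \sum_{i,j=1}^{2} \epsilon_{ij}(\underline{u})\, \epsilon_{ij}(\underline{v}) \right\} d\underline{x} , \tag{2.6} $$

$$ F(\underline{v}) = \int_{\Omega} \underline{f} \cdot \underline{v}\; d\underline{x} + \int_{\partial\Omega_T} \sum_{i=1}^{2} g_i v_i \; ds . \tag{2.7} $$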

For linear elastic fracture we limit the consideration here to a Mode I
plane stress problem with symmetric loading as in Fig. 1. This problem is
of type (2.1)-(2.3) and we note that the faces of the crack are stress free.


Fig. 1


The near-tip crack displacement field has the form, see [6],

$$
\underline{u}(r, \theta) = \frac{K_I\, r^{1/2}}{2\mu (2\pi)^{1/2}}
\begin{pmatrix}
\cos\frac{\theta}{2} \left( \kappa - 1 + 2\sin^2\frac{\theta}{2} \right) \\[4pt]
\sin\frac{\theta}{2} \left( \kappa + 1 - 2\cos^2\frac{\theta}{2} \right)
\end{pmatrix} ,
\tag{2.8}
$$

with polar coordinates $(r, \theta)$ centred at the crack tip, where the constant
$K_I$ is the (Mode I) stress intensity factor which has to be determined, and
for plane stress $\kappa = (3 - \nu)/(1 + \nu)$. The stress intensity factor is
important as it is a fracture criterion.

One way of calculating $K_I$ for this problem is to use the J-integral,
see Eshelby [2] and Rice [6], defined, see Fig. 1, as

$$ J = \int_{\Gamma} \left\{ W n_1 - T_i \frac{\partial u_i}{\partial x_1} \right\} ds , \tag{2.9} $$

where $\Gamma \subset \Omega$ is any closed curve joining the lower and upper faces of the
crack, $W = \tfrac{1}{2} \sigma_{ij} \epsilon_{ij}$ is the strain energy density and $T_i = \sigma_{ij} n_j$, $\underline{n}$ being
the outward normal unit vector to $\Gamma$. For the plane stress problem of Fig. 1,

$$ J = K_I^2 / E , \tag{2.10} $$

where E is the Young's modulus of the linear elastic material of the
problem (2.1)-(2.3).

II.2 Finite Element Method and Recovered Gradients

We adopt the same notation and assumptions as were used by Whiteman
and Goodsell in [9]. The region $\Omega$ is assumed to have subdomains $\Omega_0$, $\Omega_1$, $\Omega_2$
such that $\Omega_0 \subset\subset \Omega_1 \subset\subset \Omega_2 \subset \Omega$ and is partitioned into triangular elements.
The regions $\Omega_0$ and $\Omega_2$ are assumed to be rectangular and such that each
is the union of a finite number of squares of side h. Each square is
subdivided into two triangles using the diagonal of positive slope, so
that $\Omega_0$ and $\Omega_2$ are each meshed completely with uniform isosceles right
angled triangles. The mesh in the remainder $\Omega - \Omega_2$ is compatible with
that of $\Omega_2$ and consists of triangles of general shape.

A finite dimensional subspace $S^h \subset V$ consisting of continuous piecewise
linear functions is defined over the partition of $\Omega$ and the finite
element method is applied to (2.5) with trial function $\underline{u}_h$ and test functions
$\underline{v}_h$ from $S^h$. For $\underline{v}_h \in S^h$ we define the recovered mid-point gradient

$$ \nabla^{*} \underline{v}_h(M_k) = \tfrac{1}{2} \left\{ [\nabla \underline{v}_h]_{T_k} + [\nabla \underline{v}_h]_{T_k'} \right\} , \tag{2.11} $$

for $M_k$ the mid-points of element sides in $\Omega_0$ and $T_k$ and $T_k'$ any pair of
adjacent elements in $\Omega_0$. For element side mid-points M on $\partial\Omega_0$ the recovered
gradient is defined as

$$ \nabla^{*} \underline{v}_h(M) = \sum_{i} q_i\, [\nabla \underline{v}_h]_{T_i} , \tag{2.12} $$

where the $q_i$ are simple numerical coefficients and the summation is over
a small number of triangles involving and near to the point M, see [9].
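The effect of the mid-point averaging can be seen on a single mesh square. The sketch below is an illustration under stated assumptions and not the analysis of [9]: it takes a scalar quadratic function (the coefficients are made up), interpolates it linearly on the two triangles obtained from the diagonal of positive slope, and averages the two element gradients at the mid-point of the shared diagonal. For a quadratic on this uniform configuration the averaged value coincides with the true gradient, whereas either single element gradient is in error by O(h). The helper names are hypothetical.

```python
from fractions import Fraction


def p1_gradient(tri, u):
    """Gradient of the linear interpolant of u on a triangle (three vertices)."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    u0, u1, u2 = u(x0, y0), u(x1, y1), u(x2, y2)
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    gx = ((u1 - u0) * (y2 - y0) - (u2 - u0) * (y1 - y0)) / det
    gy = ((x1 - x0) * (u2 - u0) - (x2 - x0) * (u1 - u0)) / det
    return gx, gy


def recovered_midpoint_gradient(tri_a, tri_b, u):
    """Average of the two element gradients, in the spirit of (2.11)."""
    ga, gb = p1_gradient(tri_a, u), p1_gradient(tri_b, u)
    return (ga[0] + gb[0]) / 2, (ga[1] + gb[1]) / 2


h = Fraction(1, 4)
# One mesh square of side h, split by the diagonal of positive slope.
lower = [(Fraction(0), Fraction(0)), (h, Fraction(0)), (h, h)]
upper = [(Fraction(0), Fraction(0)), (h, h), (Fraction(0), h)]


def u(x, y):
    """Made-up quadratic test function."""
    return 3 * x * x - 2 * x * y + 5 * y * y + x - 7 * y


def grad_u(x, y):
    """Exact gradient of the test function."""
    return (6 * x - 2 * y + 1, -2 * x + 10 * y - 7)


mid = (h / 2, h / 2)                      # mid-point of the shared diagonal
recovered = recovered_midpoint_gradient(lower, upper, u)
single = p1_gradient(lower, u)
```

Exact rational arithmetic is used so that the equality between the recovered and the true gradient can be asserted without a tolerance.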


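For any element of $\Omega_0$ we take the linear interpolant to the recovered
gradients of $\underline{v}_h$ at the three side midpoints. Over the whole of $\Omega_0$ these
linear interpolants form the discontinuous piecewise linear interpolant
$\nabla^{*} \underline{v}_h$ to the recovered gradients of $\underline{v}_h$. We define the seminorm

$$ | \underline{u} - \underline{v}_h |^{*}_{1,\Omega_0} = \| \nabla \underline{u} - \nabla^{*} \underline{v}_h \|_{0,\Omega_0} . \tag{2.13} $$

It has been shown by Whiteman and Goodsell [9] that, if $\underline{u} \in V \cap (H^3(\Omega_2))^2$
is the solution of problem (2.5) and $\underline{v}_h \in S^h$, then

$$ | \underline{u} - \underline{v}_h |^{*}_{1,\Omega_0} \le C \left\{ | \underline{u}_I - \underline{v}_h |_{1,\Omega_1} + h^2\, | \underline{u} |_{3,\Omega_2} \right\} , \tag{2.14} $$

where $\underline{u}_I \in S^h$ is the piecewise linear interpolant to $\underline{u}$ at the element
vertices.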
In order to obtain an estimate for $| \underline{u} - \underline{u}_h |^{*}_{1,\Omega_0}$, it is thus necessary
to bound $| \underline{u}_I - \underline{u}_h |_{1,\Omega_1}$ in (2.14). For the problem (2.5) with the
configuration as in Fig. 1 it has been shown in [9] that

2.1  'Sh!  „  S  C{h|-I  'Sh  ,  o  *  l-I*^  , 

x  ,ftQ  L  1  »ft2  0 

i  c(h1_2e  u  +  h2  u 

*•  2,4/3-e,ft  -  O  J 


+  h  u 


+  h  u 


(2.15) 


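where $\varepsilon > 0$ is an arbitrary constant and
$\mathbf{u} \in (W^2_{4/3-\varepsilon}(\Omega))^2$. Combining inequalities
(2.14) and (2.15) we have that

$$ |\mathbf{u} - \mathbf{u}_h|^*_{1,\Omega_0} \le C\{ h^{1-2\varepsilon} |\mathbf{u}|_{2,4/3-\varepsilon,\Omega} + h^2 |\mathbf{u}|_{3,\Omega_2} \} \qquad (2.16) $$

showing the $O(h^{1-2\varepsilon})$ convergence on $\Omega_0$ of the recovered
gradient function $\nabla^*\mathbf{u}_h$ to $\nabla\mathbf{u}$.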


III.  PATH  INTEGRAL  ERROR  ESTIMATES 

For the case just considered where problem (2.5) is defined as in
Fig. 1 and thus contains a boundary singularity due to the crack and the
boundary conditions, the error estimate (2.16) holds only in an interior
subdomain $\Omega_0$, because it demands that the solution
$\mathbf{u} \in (H^3(\Omega_2))^2$ where $\Omega_2 \subset \Omega$. The
J-integral (2.9) is defined over the path $\Gamma$, which joins the




lower  to  the  upper  face  of  the  crack  as  in  Fig.  1  and  thus  contains  points 
of  This  effectively  precludes  the  use  of  estimates  of  the  type  (2.16) 

for the estimation of errors in calculated approximations J* to J. However,
we believe that it is of interest to estimate errors in path integrals of
this type over interior contours, and this we shall now do for a
representative case.


Fig.  2 


We consider the fixed contour $\Gamma \subset \Omega_0$ which for simplicity
we take as a straight line parallel to the $x_1$-axis through certain element
side mid-points of $\Omega_0$ as in Fig. 2. Along the contour $\Gamma$ the
recovered gradient function $\nabla^*\mathbf{v}_h$, $\mathbf{v}_h \in S^h$, is
continuous and piecewise linear.

It has been shown by Goodsell and Whiteman [3], that
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this  being  an  estimate  for  the  H  error  seminorm  on  the  contour  T. 
The J-integral (2.9) contains terms of the type

$$ J_{ij} \equiv \int_\Gamma (\nabla\mathbf{u})_i (\nabla\mathbf{u})_j \, dx_1, \qquad (3.2) $$

which we approximate with terms of the type

$$ J^*_{ij} \equiv \int_\Gamma (\nabla^*\mathbf{u}_h)_i (\nabla^*\mathbf{u}_h)_j \, dx_1. \qquad (3.3) $$
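
Because the recovered gradient components are continuous and piecewise linear along the contour, the integrand of the approximating term is piecewise quadratic, so Simpson's rule applied segment by segment evaluates it exactly. The following is a minimal sketch of that evaluation; the node positions and nodal values are hypothetical inputs, not data from the paper.

```python
def path_integral_of_product(x, f, g):
    """Integrate f*g along a straight contour.

    x : increasing node positions along the contour
    f, g : nodal values of two continuous piecewise linear functions
    Simpson's rule per segment is exact because f*g is quadratic there.
    """
    total = 0.0
    for k in range(len(x) - 1):
        dx = x[k + 1] - x[k]
        f_mid = 0.5 * (f[k] + f[k + 1])
        g_mid = 0.5 * (g[k] + g[k + 1])
        total += dx / 6.0 * (f[k] * g[k] + 4.0 * f_mid * g_mid
                             + f[k + 1] * g[k + 1])
    return total


# Check against a known value: the integral of x*(1 - x) over [0, 1] is 1/6.
nodes = [0.0, 0.5, 1.0]
value = path_integral_of_product(nodes, [0.0, 0.5, 1.0], [1.0, 0.5, 0.0])
```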
We are now able to bound $|J_{ij} - J^*_{ij}|$, and the estimate and the
proof are given in the following theorem.

Theorem If $\mathbf{u} \in V \cap (H^3(\Omega_2))^2$ is the solution of
problem (2.5) defined in the region of Fig. 1 and $J_{ij}$ and $J^*_{ij}$ are
defined respectively as in (3.2) and (3.3), then


<  cjlvul  [hi_2£  u 

1J  1J  It — *0  ,?!■ 


2  ,4/ 3-e,n 


+  h  *  u 


,  l-4£  4 

+  h  u 


2  2  1 

+  h3  u  >  . 

2,4'3-e,Q  3,^2J 


(3.4)


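Proof From (3.2) and (3.3) we have that

$$ |J_{ij} - J^*_{ij}| = \Big| \int_\Gamma \{ (\nabla\mathbf{u})_i (\nabla\mathbf{u})_j - (\nabla^*\mathbf{u}_h)_i (\nabla^*\mathbf{u}_h)_j \} \, dx_1 \Big| $$

$$ \le \int_\Gamma |(\nabla^*\mathbf{u}_h)_j (\nabla^*\mathbf{u}_h - \nabla\mathbf{u})_i| \, dx_1 + \int_\Gamma |(\nabla\mathbf{u})_i (\nabla^*\mathbf{u}_h - \nabla\mathbf{u})_j| \, dx_1 $$

$$ \le \int_\Gamma \{ 2|\nabla\mathbf{u}| + |\nabla^*\mathbf{u}_h - \nabla\mathbf{u}| \} \, |\nabla^*\mathbf{u}_h - \nabla\mathbf{u}| \, dx_1 $$

$$ \le 2 \|\nabla\mathbf{u}\|_{0,\Gamma} \, |\mathbf{u} - \mathbf{u}_h|^*_{1,\Gamma} + \big( |\mathbf{u} - \mathbf{u}_h|^*_{1,\Gamma} \big)^2, $$

so that result (3.4) follows immediately using (3.1).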
If  we  now  define 

< — i  ^  ~ i 

K.  .  =  J. .  ,  K . .  =  J. . 

ij  1J  xj  xj 


then,  using  (3.4),  it  follows  that 

I  ~  I  I  ~  I  ^ 
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The  result  (3.5)  is  of  course  not  an  estimate  for  the  error  in  the 
approximation  K*  to  K^,  which  would  be  derived  first  by  using  gradients 
recovered from the finite element approximation $\mathbf{u}_h$ at element side
mid-points in the computation of J* approximating the J of (2.9), and then
applying  (2.10). 

Further, even if a result of the type (3.5) were true for K*, it does
not have as good a rate of convergence as that derived in [1], where no
recovery of gradients is employed. However, numerical evidence, see [3],
indicates that in fact the convergence of  to  is O(h), thus
suggesting that if the method were applied to obtain K* the convergence
would be O(h). The disappointing fact that the rate of convergence would
not even then be better than O(h) is due to the lack of smoothness of the
solution of the fracture problem, see [9].
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SOME  ISSUES  OF  NUMERICAL  INTEGRATION  AND  PENALTY  RELAXATION 
ABSTRACT. A simple two-node axisymmetric shell element with a
shallowly curved meridian, shear deformation, and rotary inertia is
developed. The major aspects include: (a) anisoparametric interpolations
of the displacement variables to design out excessive stiffening
due to membrane and shear 'locking'; (b) consistent shear relaxation to
further upgrade the element strain energy; (c) low-order quadrature
evaluation. The resulting element possesses an improved condition of
the stiffness matrix, increased efficiency in explicit time integration,
and enhanced accuracy in coarse discretizations. Comprehensive vibration
examples are carried out to assess the element performance. The
numerical results demonstrate a wide applicability range with respect to
element slenderness and curvature properties.

I. INTRODUCTION. Shear-deformable curved beam and shell finite
elements formulated by the displacement approach present a number of
conceptual difficulties [1,2]. The major issue is that of properly
approximating so-called penalty strains. These are membrane strains
that, due to initial geometry curvatures, couple membrane and bending
deformations; and transverse shear strains which couple the transverse
displacement and normal rotation kinematic variables. The computational
difficulties arise when the element geometry is very thin, in which case
the states of inextensional and shearless deformations are enforced by
the presence of large multipliers (or penalty parameters) of the penalty
strain energies. The enforcement of these deformation states at the
element level implies that each polynomial coefficient of the penalty
strain vanishes in the limit as the element becomes infinitely thin.
The resulting constraint equations (known as penalty modes) are either
properly coupled (involving contributions from all kinematic variables
of the penalty strain) or spuriously uncoupled (having degrees of freedom
(d.o.f.) from a single kinematic variable). It is the latter type
of penalty modes that either produces a nearly vanishing kinematic
response (the phenomenon known as 'locking') or yields excessively stiff
solutions. Thus, having properly coupled penalty strains (in all modes)
is paramount in achieving practical convergence in the thin regime.

Although requiring properly coupled penalty strains is necessary,
it is often not sufficient to ensure adequate thin-regime behavior. For
instance, reduced integration and an analogous (and often equivalent)
"field-consistent" approach [21,22], which produce properly coupled
penalty strains, have been shown [11] to yield inconsistent force vectors
(due to distributed loads) and mass matrices, thereby producing
dramatically inferior results in problems involving higher vibrational
modes or distributed loading. Moreover, simple (low-order) elements
often experience undesirable overconstraining at coarse discretization
levels, and lock severely in cases of overly restrained boundaries
[3,4].


To avoid locking and/or excessive constraining entirely, 'relaxation'
of element penalty constraints proved effective. The concept of
relaxing shear constraints, advocated by Fried [5] and MacNeal [6]
(though interpreted somewhat differently), introduces an element
relaxation parameter (correction factor [7]) in the shear stress-resultant(s),
hence appearing as a multiplier to the penalty parameter. As the element
approaches its thin limit, the relaxation parameter diminishes,
reducing the penalty value. The remarkable aspect of this approach is
that at the global (whole discretization) level the penalty constraints
are enforced in a superior fashion [2]. Furthermore, the penalty
relaxation provides practical benefits such as enhanced accuracy in coarsely
discretized models, a well-conditioned stiffness matrix over the whole
range of the element slenderness, and a reduced value of the highest
element frequency. The latter aspect allows larger time steps in the
explicit time integration procedures.


In this paper we will develop a simple yet extremely effective
shallowly-curved, axisymmetric, displacement-type shell element in which
the effects of shear deformation and, in dynamics, rotary inertia are
included. The element is an extension of the conical shell proposed in
[8]. This effort will lay the groundwork for a three-dimensional shallow
shell model.


The axisymmetric shell is an analog of a curved beam element discussed
in [2]. From the interpolation standpoint the two elements are
identical. They employ so-called anisoparametric (i.e., distinct degree)
kinematic polynomials, which yield proper polynomial representations
for the membrane and shear penalty strains. By enforcing the
higher-degree membrane and shear penalty modes explicitly (i.e., by
insisting upon constant variation of these strains along the element
meridian), a two-node configuration having three d.o.f. at each node is
obtained. This, of course, implies that the lowest possible integration
order (single-point Gauss quadrature) exactly evaluates the respective
strain energy contributions. However, normal Gauss quadrature rules are
used throughout so that the kinematic reliability of the element is
ensured (i.e., the only zero energy modes are those due to the
rigid-body motion) along with the variational consistency of the load
vector and the mass matrix.


Several penalty relaxation ideas are discussed. It is concluded
that a single penalty relaxation parameter on the shear stress resultant
(and, hence, the shear strain energy) can effectively be employed to
enhance the strain energy approximation. The parameter is found
analytically by a strain energy matching procedure.

The numerical experiments focus on the dynamic behavior of the element;
specifically, on its performance in free vibration problems.
Results are presented for a wide range of shell geometries including very
thin, moderately thick, shallow, and deep axisymmetric shells.
II. SHALLOW SHELL EQUATIONS. To present the finite element approach
in a clear fashion, we shall focus exclusively upon the
axisymmetric linearly elastic shell equations in which the effects of
shear deformation and rotary inertia are included in the manner of
Naghdi [9] and, furthermore, the meridian curvature effect is accounted
for using the shallowness approximations of Marguerre [10]. The
methodology, however, is general enough to be applicable to an asymmetric
shell-of-revolution response once the appropriate shell equations are
invoked.

Consider the shallow axisymmetric shell element depicted in Figure
1. The kinematic variables describing the axisymmetric response are the
middle-surface membrane displacement, $u(s,t)$ (henceforth, $t$ denotes
time), transverse displacement, $w(s,t)$, and meridian cross-sectional
rotation, $\theta(s,t)$. Note that due to the shallowness assumption [10],
which in effect is a perturbation from a conical (straight meridian) shell,
these variables are attributed to the conical surface rather than the
actual curved one. As a consequence of this simplifying assumption, all
energy integrals are carried out over the conical surface.

The strain and curvature components may be written as:

Membrane strains

$$ \varepsilon_s = u_{,s} + w_{I,s} \, w_{,s}, \qquad \varepsilon_\phi = (u \sin\phi + w \cos\phi)/r \qquad (2.1) $$

Bending curvatures

$$ \kappa_s = -\theta_{,s}, \qquad \kappa_\phi = -\theta \sin\phi / r \qquad (2.2) $$

Shear strain

$$ \gamma = w_{,s} - \theta \qquad (2.3) $$

where $s$, $\phi$, and $r$ denote the shell coordinates, and $w_I$ describes a
shallow meridian shape of the shell ($w_{I,s}^2 \ll 1$). Note that when the
meridian is straight (i.e., $w_I = 0$, a conical shell), all strains
(2.1)-(2.3) are those according to Naghdi theory [9].
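
As a reading aid, the strain measures (2.1)-(2.3) can be evaluated pointwise as in the sketch below. It assumes the sign conventions as they are read from the text here (in particular negative signs on both curvatures); the function and argument names are illustrative only.

```python
import math


def shallow_shell_strains(u, w, theta, u_s, w_s, theta_s, wI_s, phi, r):
    """Pointwise strains of the shallow axisymmetric shell.

    u, w, theta      : membrane displacement, deflection, rotation
    u_s, w_s, theta_s: their derivatives with respect to s
    wI_s             : slope of the initial (shallow) meridian shape
    phi, r           : meridian angle and parallel-circle radius
    """
    return {
        "eps_s": u_s + wI_s * w_s,                              # (2.1)
        "eps_phi": (u * math.sin(phi) + w * math.cos(phi)) / r,  # (2.1)
        "kappa_s": -theta_s,                                    # (2.2)
        "kappa_phi": -theta * math.sin(phi) / r,                # (2.2)
        "gamma": w_s - theta,                                   # (2.3)
    }


# Straight meridian (wI_s = 0): the meridian strain reduces to u_s.
conical = shallow_shell_strains(0.0, 0.0, 0.1, 0.02, 0.1, 0.0, 0.0,
                                math.pi / 2, 1.0)
```

With `w_s` equal to `theta`, as in the example, the shear strain vanishes, which is the Poisson-Kirchhoff state discussed in the next section.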

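The corresponding stress resultants are related to the strains
through the constitutive relations:

$$ \mathbf{N} = \mathbf{D}_m \boldsymbol{\varepsilon}, \qquad \mathbf{M} = \mathbf{D}_b \boldsymbol{\kappa}, \qquad Q = D_s \gamma \qquad (2.4) $$

where

$$ \mathbf{N}^T = (N_s, N_\phi), \quad \mathbf{M}^T = (M_s, M_\phi), \quad \boldsymbol{\varepsilon}^T = (\varepsilon_s, \varepsilon_\phi), \quad \boldsymbol{\kappa}^T = (\kappa_s, \kappa_\phi). $$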

For a homogeneous isotropic shell of constant thickness $h$ the
constitutive matrices are:

$$ \mathbf{D}_m = \frac{Eh}{1-\nu^2} \begin{bmatrix} 1 & \nu \\ \nu & 1 \end{bmatrix}, \qquad \mathbf{D}_b = \frac{h^2}{12} \mathbf{D}_m, \qquad D_s = k^2 G h \quad (k^2 = \pi^2/12), \qquad (2.5) $$

where $E$, $G$, and $\nu$ denote Young's modulus, shear modulus, and Poisson's
ratio, respectively; $k^2$ is the shear correction factor; and $h$ is the
shell thickness.


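The equations of motion are readily derived from Hamilton's
variational principle:

$$ \delta \int_{t_0}^{t_1} L \, dt = \delta \int_{t_0}^{t_1} \Big\{ \tfrac{1}{2} \int_0^\ell [\rho h (\dot{u}^2 + \dot{w}^2) + \rho h^3 \dot{\theta}^2 / 12] \, 2\pi r \, ds - \tfrac{1}{2} \int_0^\ell [\mathbf{M}^T \boldsymbol{\kappa} + \mathbf{N}^T \boldsymbol{\varepsilon} + Q \gamma] \, 2\pi r \, ds + \int_0^\ell w q \, 2\pi r \, ds \Big\} \, dt = 0 \qquad (2.6) $$

where a superior dot denotes differentiation with respect to $t$, $\rho$ is
the mass density, and $q$ is the distributed transverse loading.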

III. PENALTY STRAIN ISSUES AND INTERPOLATION CONSEQUENCES. The
pivotal issue in formulating an effective finite element based on this
theory is the resolution of a penalty effect engendered in the thin
shell-element regime. For the present case, we distinguish two types
of penalized strains that control thin-regime behavior: the membrane
meridian strain and the transverse shear strain. With $\ell$ being fixed and
$h \to 0$, the thin limits of membrane inextensibility and shearless
(Poisson-Kirchhoff) deformation are enforced at the element level. The
pivotal constraints take the form:

Meridian inextensibility: $\varepsilon_s = u_{,s} + w_{I,s} \, w_{,s} \to 0$ (3.3)

Poisson-Kirchhoff: $\gamma = w_{,s} - \theta \to 0$ (3.4)

It follows that this constraining action reduces the number of independent
d.o.f. by at least two. When standard isoparametric schemes are
used (i.e., uniform kinematic interpolation), spurious 'locking'
constraints take precedence, making an element extremely stiff [3]. Clearly,
the lower-order elements are most susceptible to 'locking'.

The desired interpolation requirements are that:

(1) the polynomial degrees of $u$, $w$, and $\theta$ should accommodate
consistent coupling within the vanishing strain coefficients (penalty modes);
(2) the number of penalty modes should be small to further reduce the
possibility of excessive kinematic constraining.


Having a simple two-node element as our goal, the $C^0$ interpolation
strategy developed in [2] is invoked. Considering (3.4), it is clear
that if $\theta = O(s)$ (linear) then $w$ should be $O(s^2)$. The
interpolation for $u$ can then be derived from the requirement posed by the
penalty constraints of (3.3). Assuming $w_I$ is cubic (refer to Fig. 1),

$$ w_I(s) = \beta_0 \ell (\eta - 2\eta^2 + \eta^3) + \beta_1 \ell (\eta^3 - \eta^2), \qquad (3.5) $$

where $\beta_i = w_{I,s}(\eta_i)$, $i = 0, 1$ ($\eta = s/\ell \in [0,1]$),
it follows that $u = O(s^4)$. By explicitly enforcing the shear and membrane
meridian strains to be constant within the element, in a manner coincident
with the curved beam formulation [2], the desired two-node (six
d.o.f.) kinematic field is derived:


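$$ \theta = \sum_{i=0}^{1} N_i(\eta) \, \theta_i, \qquad w = \sum_{i=0}^{1} [N_i(\eta) \, w_i + K_i(\eta^2) \, \theta_i], $$

$$ u = \sum_{i=0}^{1} [N_i(\eta) \, u_i + L_i(\eta^3) \, w_i + M_i(\eta^4) \, \theta_i], \qquad (3.6) $$

where expressions for the shape functions can be found in [2]. The
resulting constant $\varepsilon_s$ and $\gamma$ strains are:

$$ \varepsilon_s = \frac{1}{\ell}(u_1 - u_0) + \beta(\theta_1 - \theta_0), \qquad \gamma = \frac{1}{\ell}(w_1 - w_0) - \frac{1}{2}(\theta_1 + \theta_0), \qquad (3.7) $$

where $\beta = (\beta_1 - \beta_0)/12$.

In the thin limit ($\varepsilon_s \to 0$, $\gamma \to 0$), these penalty modes
ensure the desired kinematic coupling.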
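
The coefficient multiplying the rotation difference in the constant meridian strain can be checked by exact rational arithmetic. The sketch below assumes that the constant strain is the element mean of the meridian strain, that the initial shape is the cubic with end slopes beta_0 and beta_1 and zero end values, and that the slope of w is its nodal-difference part plus a term linear in eta arising from the constant-shear condition. The helper names are illustrative.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def poly_int01(p):
    """Exact integral of a polynomial over eta in [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))


# Slope of the cubic initial shape: beta_0 * slope0 + beta_1 * slope1.
slope0 = [1, -4, 3]   # d/d(eta) of (eta - 2 eta^2 + eta^3)
slope1 = [0, -2, 3]   # d/d(eta) of (eta^3 - eta^2)

# With constant shear and linear theta, the slope of w is
# (w_1 - w_0)/l + (theta_1 - theta_0) * (eta - 1/2).
centered = [Fraction(-1, 2), 1]

mean_slope = (poly_int01(slope0), poly_int01(slope1))
coef_beta0 = poly_int01(poly_mul(slope0, centered))
coef_beta1 = poly_int01(poly_mul(slope1, centered))
```

The mean slopes vanish, so the nodal deflections drop out of the mean meridian strain, and the remaining coefficients are -1/12 and +1/12, which is the factor (beta_1 - beta_0)/12 multiplying the rotation difference.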

IV. STRAIN ENERGY UPGRADING VIA PENALTY RELAXATION. The preceding
formulation, utilizing analytic shell equations directly in a finite
element variational scheme, may be regarded as conventional. This
common approach renders the membrane and shear penalty constraints (3.3)
and (3.4) enforceable at the element level, consequently requiring
consistent interpolations [2,12] or related strategies [13,14] to
overcome the thinness limitations. Although in one-dimensional
interpolation models such strategies are generally successful, they are,
in fact, insufficient in three-dimensional plate/shell models, where boundary
restraints often produce shear locking [3,4].

Another deficiency of the conventional approach is that in the thin
regime the stiffness matrix is ill-conditioned. In addition to requiring
high-precision computations, the ill-conditioning causes prohibitively
small time steps in explicit transient integrations [15] and, as
we shall see further, unrealistically large errors in the higher natural
frequencies and corresponding mode shapes.


We therefore view the conventional approach as too prohibitive for
generating simple and effective elements. To produce well behaved
thin-regime elements, an element level relaxation of penalty constraints
is undertaken. The idea is to introduce a correction parameter in the
element constitutive relations that would account for the limited
kinematic freedoms of penalty strains by reducing the penalty parameter
in the thin limit. In this circumstance, the penalty constraints are
said to be relaxed at the element level. The problems of locking,
excessive stiffening, and ill-conditioning would then be eliminated.


Shear relaxation. For clarity, it suffices to consider the shallowly
curved beam [2], possessing the same basic penalty features as the
present shell. To illustrate the concept, consider a curved cantilever
beam, with an initial cubic shape, loaded at its free end by both
membrane, $N_1$, and shear, $Q_1$, forces. The shear relaxation (correction
[7]) is introduced via a positive parameter $\phi_s^2$, which appears as a
multiplier in the transverse shear constitutive relations of the element:

$$ Q = \phi_s^2 k^2 G A \gamma \qquad (4.1) $$


Equating the strain energy captured by a single anisoparametric curved
beam element of the lowest order ($p=1$) [2] with the exact strain
energy for this problem, and solving for $\phi_s^2$, results in the same
expression as in the straight beam case [7]:

$$ \phi_s^2 = (1 + C_s \alpha_s)^{-1} \qquad (4.2) $$

in which

$$ \alpha_s = 3 k^2 \frac{G}{E} (\ell/h)^2 \qquad (4.2\text{-a}) $$

is the shear penalty parameter, and $C_s = 1/3$.


An important consequence of the above result is that the modified
element has a new penalty parameter

$$ \alpha_s^* = \alpha_s \phi_s^2 = \frac{\alpha_s}{1 + C_s \alpha_s} \qquad (4.3) $$

with the following desirable properties:

$$ \alpha_s^* \to \begin{cases} C_s^{-1} & \text{if } h \to 0 \text{ (with fixed } \ell\text{), i.e. thin regime} \\ \alpha_s & \text{if } \ell \to 0 \text{ (with fixed } h\text{), i.e. thick regime} \end{cases} $$

In the thin limit, the new penalty parameter approaches a finite value.
This implies that the Poisson-Kirchhoff constraint is relaxed at the
element level. (By contrast, the conventional penalty parameter
approaches infinity in this case).
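
The two limits can be checked numerically with a short sketch. It assumes the expressions as read here, namely a shear penalty parameter 3 k^2 (G/E) (l/h)^2 and a relaxation factor 1/(1 + C_s alpha_s); the function names are illustrative. The last lines additionally assume that the element length of a 10-degree spherical cap is the meridian arc divided by the number of elements, which is an assumption of this sketch rather than a statement of the paper.

```python
import math

K2 = math.pi ** 2 / 12  # shear correction factor used in the numerical studies


def shear_penalty(l_over_h, g_over_e, k2=K2):
    """Conventional shear penalty parameter alpha_s."""
    return 3.0 * k2 * g_over_e * l_over_h ** 2


def relaxation_factor(alpha_s, c_s):
    """Shear relaxation parameter phi_s^2."""
    return 1.0 / (1.0 + c_s * alpha_s)


def relaxed_penalty(alpha_s, c_s):
    """Relaxed penalty parameter alpha_s * phi_s^2."""
    return alpha_s * relaxation_factor(alpha_s, c_s)


G_OVER_E = 1.0 / 2.6

# Thin regime: the relaxed parameter tends to 1/C_s.
thin = relaxed_penalty(shear_penalty(1.0e6, G_OVER_E), 1.0 / 3.0)

# Thick regime: the relaxed parameter tends to alpha_s itself.
alpha_thick = shear_penalty(1.0e-3, G_OVER_E)
thick = relaxed_penalty(alpha_thick, 1.0 / 3.0)


def cap_phi2(n_el, r_over_h, c_s=0.125, half_angle_deg=10.0):
    """phi_s^2 for a spherical cap meshed with n_el equal elements."""
    l_over_h = math.radians(half_angle_deg) / n_el * r_over_h
    return relaxation_factor(shear_penalty(l_over_h, G_OVER_E), c_s)
```

Under these assumptions `cap_phi2(4, 10)` and `cap_phi2(8, 10)` round to 0.978 and 0.994, the relaxation values tabulated for the moderately thick 10-degree shell.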
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It  is  apparent  from  this  analysis  that  the  new  element  is  upgraded 
in  the  energy  sense  to  the  level  of  a  higher-order  element,  namely,  the 
second  order  element  (p=2),  which  happens  to  model  this  cantilever  beam 
problem  exactly. 

Membrane relaxation. In [2], in addition to the shear relaxation
parameter $\phi_s^2$, we also employed the membrane relaxation parameter,
$\phi_m^2$, which served as a multiplier in the membrane constitutive
relations

$$ N = \phi_m^2 A E \varepsilon \qquad (4.4) $$

having the form analogous to that of $\phi_s^2$,

$$ \phi_m^2 = (1 + C_m \alpha_m)^{-1} \qquad (4.5) $$

in which $\alpha_m$ is the conventional membrane penalty parameter

$$ \alpha_m = 12 (\beta \ell / h)^2 \qquad (4.6) $$
The constant $C_m = 1/4$ was established on the basis of numerical tests to
yield an overall best element performance. (Although the results reported
in [2] were based on the correct $\alpha_m$ shown in (4.6), the $\beta$
contribution in $\alpha_m$ was typographically omitted in the text). It was
found that $\phi_m^2$ produced only minor solution improvements; the major
enhancement was due to the shear relaxation, $\phi_s^2$.

This outcome can be predicted by assessing the relative strength of
the two conventional penalty parameters, which can be defined by the
ratio:

$$ R = \alpha_m / \alpha_s \qquad (4.7) $$

For a typical isotropic shallow element $R \le 0.01$, and thus $\alpha_m$ is
at least two orders of magnitude weaker than its shear counterpart. This
implies that much of the penalty related stiffening action is predominantly
due to $\alpha_s$. Thus, the $\phi_m^2$ relaxation of the inextensional
membrane strain is not essential. We shall further highlight many of these
issues by means of numerical examples.
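
The order of magnitude of this ratio can be checked with a short sketch. It takes R as the membrane penalty parameter divided by the shear penalty parameter, which is an interpretation of the text, and uses the expressions for the two parameters as read above. The end slopes of plus and minus pi/64 are those quoted later for the single-element time-step study; the function name is illustrative.

```python
import math


def penalty_ratio(beta, e_over_g, l_over_h=1.0, k2=math.pi ** 2 / 12):
    """Ratio of membrane to shear penalty parameters."""
    alpha_m = 12.0 * (beta * l_over_h) ** 2
    alpha_s = 3.0 * k2 * (1.0 / e_over_g) * l_over_h ** 2
    return alpha_m / alpha_s


# Initial-shape parameter beta = (beta_1 - beta_0) / 12 for end slopes
# beta_0 = pi/64 and beta_1 = -pi/64.
beta = (-math.pi / 64 - math.pi / 64) / 12
ratio = penalty_ratio(beta, e_over_g=2.6)
```

The slenderness cancels in the ratio, so R depends only on the shallowness parameter and the material constants; for this element it is below one thousandth.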

V. NUMERICAL RESULTS. We focus our numerical studies exclusively
on natural vibrations of spherical shells, ranging from shallow to deep
and from thin to moderately thick. Our motivation is to assess the
element on the basis of its dynamic performance by examining a wide
range of vibrational modes. The computed frequencies are compared with
available analytic and finite element solutions.

In all numerical examples, unless stated otherwise, the values of
$E/G = 2.6$ and $k^2 = \pi^2/12$ were assumed. The calculations were carried
out on an Apollo DN3000 in double precision.
Anisoparametric versus isoparametric element. In this study we establish
an appropriate Gaussian quadrature rule for the present element
without shear relaxation (labelled ANISO-2) and compare the element
performance to that of a two-node linear isoparametric element (labelled
ISO-2). The test problem is a deep/thin clamped hemispherical shell
(see Figure 2d) which, due to its thinness (R/h=100) and deep curvature,
is a challenging 'locking' test for this class of shear-deformable
curved elements.

Table 1 summarizes the ten lowest symmetric frequencies obtained
with a 16-element ISO-2 discretization using 1-, 2-, and 3-point
Gaussian quadrature. The results are compared with the benchmark
frequencies from a 256-element ANISO-2 model, fully integrated with 3-point
Gaussian quadrature. The 2- and 3-point quadrature solutions agree very
closely. The first frequency is sufficiently accurate; however, the
higher modes experience severe stiffening (locking) as evidenced by
their overestimated frequencies. The results corresponding to the
1-point quadrature, which underintegrates all energy contributions (this
curved element is a direct analog of the 1-point quadrature conical
shell of Zienkiewicz et al. [16]), produce frequencies converging from
either below or above, confirming its variational inconsistency. Again,
the highest frequencies are noticeably overestimated.

By contrast, all ten frequencies obtained with ANISO-2 (see Table
2) using 2- and 3-point quadratures are highly accurate. The results
based on the 1-point quadrature, which exactly integrates strain energy
contributions due to constant strains ($\varepsilon_s$, $\gamma$) and
curvature ($\kappa_s$), are slightly more accurate still, converging
consistently from above (the convergence study is not shown). However,
further studies must be carried out to verify the reliability of ANISO-2
with the 1-point integration. Henceforth, the 2-point quadrature will be
used to integrate the ANISO-2 element.
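
The polynomial exactness behind this choice can be illustrated with the standard Gauss-Legendre rules on the interval from -1 to 1. The sketch below is generic and does not reproduce the element integrals themselves; it only shows that one point is exact for constant and linear integrands, whereas higher-degree integrands, such as those that typically arise in consistent mass and load terms, need more points.

```python
import math

# Standard Gauss-Legendre points and weights on [-1, 1].
GAUSS_RULES = {
    1: ([0.0], [2.0]),
    2: ([-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)], [1.0, 1.0]),
    3: ([-math.sqrt(0.6), 0.0, math.sqrt(0.6)],
        [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]),
}


def gauss_integrate(func, n_points):
    """Integrate func over [-1, 1] with the n-point Gauss-Legendre rule."""
    points, weights = GAUSS_RULES[n_points]
    return sum(wt * func(pt) for pt, wt in zip(points, weights))


constant_1pt = gauss_integrate(lambda x: 3.0, 1)        # exact value 6
quadratic_1pt = gauss_integrate(lambda x: x * x, 1)     # exact value 2/3
quadratic_2pt = gauss_integrate(lambda x: x * x, 2)
quartic_2pt = gauss_integrate(lambda x: x ** 4, 2)      # exact value 2/5
quartic_3pt = gauss_integrate(lambda x: x ** 4, 3)
```

An n-point rule integrates polynomials up to degree 2n-1 exactly, so the 1-point rule misses the quadratic and the 2-point rule misses the quartic.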

$C_s$ constant. A suitable $C_s$ value can be established by insisting upon
monotonic convergence of vibration frequencies from above, a property
which is intrinsic to conforming displacement models. For this purpose,
taking into account that $\phi_s^2$ is independent of $\beta$, it suffices to
seek frequencies of vibration of a circular plate (see Figure 2a). In Figure
3, the error of the first symmetric frequency of a clamped circular
plate (R/h=100) is plotted versus the number of elements, where the four
curves correspond to $C_s$ = 0, 1/20, 1/8, and 1/5. The best results are
obtained with $C_s$ = 1/8. Henceforth, this value is adopted for the
element, labelled ANISO-2$\phi$. Note that the beneficial effect of $C_s$ is
especially pronounced in the coarse models.

Further evidence of the effect of shear relaxation is illustrated
in Figure 4, where the first, third, and fifth symmetric frequencies of
ANISO-2 (i.e., $C_s$ = 0) are normalized with the corresponding ANISO-2$\phi$
results. It is seen that the higher frequencies benefit the most from
the shear relaxation.

Shallow shells. Tables 3 and 4 summarize the ANISO-2 and ANISO-2$\phi$
results, respectively, for the first five symmetric frequencies of a
thin (R/h=100) and moderately thick (R/h=10) 10-degree clamped spherical
shell (see Figure 2b). The 256-element benchmark frequencies and those
obtained by a modified Holzer method [17] are cited for comparison
purposes. The frequencies obtained with the ANISO-2$\phi$ elements are
consistently lower than those of ANISO-2; hence, they are more accurate,
since the convergence is from above. The results are also superior to
those reported in [17]. Note that the effect of shear relaxation is
particularly beneficial in coarsely discretized models and higher
vibrational modes. The diminishing influence of the shear relaxation
parameter is noticeable as the mesh is further refined.

Deep shells. Tables 5 and 6 contain the ANISO-2 and ANISO-2$\phi$ results,
respectively, for the first five symmetric frequencies of a thin
(R/h=100) and moderately thick (R/h=10) clamped hemispherical shell (see
Figure 2d). Again, the 256-element benchmark frequencies and those
obtained by a modified Holzer method [17] are cited for comparison
purposes. The ANISO-2$\phi$ frequencies are consistently more accurate than
those of ANISO-2 and those reported in [17].

To benchmark the element behavior further, we compared ANISO-2$\phi$
with two commonly used isoparametric axisymmetric shell elements from
the ABAQUS finite element program [23], SAX1 (2-node, linear) and SAX2
(3-node, quadratic). Both of the ABAQUS elements use reduced integration
on the shear energy, and a shear relaxation parameter of a form
somewhat different from the present one. Figure 5 shows the percent
error for the first ten symmetric frequencies using a 48-d.o.f. model
for the thin, clamped hemispherical shell. Whereas ANISO-2$\phi$ performs
consistently well, SAX1 produces rather poor frequencies throughout, and
SAX2 begins to deteriorate at higher frequencies. In addition, unlike
ANISO-2$\phi$, SAX1 and SAX2 do not converge monotonically: some
frequencies converge from above, while others converge from below.

Sixty-degree shell. Our motivation for analyzing the clamped 60-degree
shell (R/h=20) (see Figure 2c) was to compare the present element
results with several others [18-20]. Table 7 contains frequencies for the
first eight symmetric modes of vibration. A 24-element model was used.
The present elements produce consistently lower frequencies, and because
they converge from above, they are of superior accuracy. Of particular
interest is the steady progression of improvement as the element is
upgraded from ANISO-2 to ANISO-2$\phi$.

Explicit integration. In this example we illustrate an often neglected
attribute of penalty relaxation. In the explicit conditionally stable
transient integration, the critical time step is bounded by the inverse
of the largest natural frequency of the individual elements ($\omega^e$;
e.g., see [15]). Figure 6 depicts the normalized critical time step

$$ \Delta t_{crit} = 2c / (\ell \omega^e) \qquad (c = \sqrt{E/\rho} \text{ is the bar-wave velocity}) $$

for a single, lumped mass element ($\beta_0 = -\beta_1 = \pi/64$) plotted
versus $\ell/h$. While the penalty-relaxed solution (ANISO-2$\phi$,
$C_s$ = 0.125) is bounded by a constant ($\Delta t_{crit}$ = 0.257) as
$\ell/h \to \infty$, the standard formulation (ANISO-2, $C_s$ = 0) exhibits
an exponential decline, falling several orders of magnitude below the
results for the relaxed case. This example dramatically illustrates the
enormous computational efficiency that can be achieved by shear
relaxation. Notably, other methods for enhancing thin-regime
behavior (e.g., [21,22]) do nothing to improve on the poor critical time
step performance of the standard (unrelaxed) element.
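
The role of the largest frequency in conditionally stable explicit integration can be seen on a single undamped oscillator. The sketch below uses the central-difference scheme, for which the stability limit is a time step of 2 divided by the circular frequency; it is a generic illustration and not the shell element itself, and the function name and step count are arbitrary.

```python
def central_difference_peak(omega, dt, steps=200):
    """Integrate x'' = -omega^2 x by central differences.

    Starts from x(0) = 1 with zero initial velocity and returns the
    largest displacement magnitude seen during the run.
    """
    a = (omega * dt) ** 2
    x_prev = 1.0
    x = 1.0 - 0.5 * a          # second-order accurate first step
    peak = max(abs(x_prev), abs(x))
    for _ in range(steps):
        x_next = (2.0 - a) * x - x_prev
        x_prev, x = x, x_next
        peak = max(peak, abs(x))
    return peak


omega = 50.0
dt_crit = 2.0 / omega
stable_peak = central_difference_peak(omega, 0.9 * dt_crit)
unstable_peak = central_difference_peak(omega, 1.1 * dt_crit)
```

Just below the limit the response stays bounded by its initial amplitude; just above it the amplitude grows without bound. Lowering the highest element frequency therefore raises the admissible time step in direct proportion.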

Extreme thinness regime. All shell problems presented herein were
solved using an extremely thin shell geometry ($R/h = 10^6$). No locking of
any type was observed in this extreme-thinness regime.

VI. CONCLUDING SUMMARY. We have presented a shallowly-curved
axisymmetric shell element which includes the effects of shear deformation
and rotary inertia. In our displacement formulation, we focus
particular attention upon the anisoparametric interpolations, shear
relaxation, and low-order numerical integration. The result is a
simple and efficient two-node shell devoid of shear and membrane
locking, having no thinness limitations. In addition, shear relaxation
(correction) of the shear penalty improved coarse-mesh behavior and
produced an element of superior efficiency in explicit time integration.

We  regard  this  element  as  an  excellent  candidate  for  large-scale 
computations,  nonlinear  applications,  time  integration  procedures,  and 
microcomputer  implementation.  Finally,  the  present  methodology  appears 
ideally  suited  for  application  to  general  shell  models. 
Figure 1.

Figure 2. (a) Clamped circular plate; (b) Shallow 10 deg. shell;
(c) Deep 60 deg. shell; (d) Deep hemispherical shell.

Figure 5. Vibration of clamped, thin hemispherical shell (R/h = 100,
ν = 0.3); comparison of ANIS0*2$ with SAX1 and SAX2 of ABAQUS for the
ten lowest symmetric frequencies. (Axis label: natural frequency.)

Figure 6. Normalized critical time step vs. (ℓ/h) for a shallow shell;
comparison of shear-relaxed (ANIS0*2$) and unrelaxed (ANIS0*2) models.
TABLE 2. A study of Gaussian integration for ANIS02;
symmetric nondimensional natural frequencies
ω[ρR²(1-ν²)/E]^(1/2) for clamped, thin hemispherical shell

TABLE 3. Nondimensional symmetric natural frequencies ω[ρR²(1-ν²)/E]^(1/2) for
clamped shallow (10 deg), thin spherical shell (R/h = 100, ν = 0.3)

clamped shallow (10 deg), moderately thick spherical shell (R/h = 10, ν = 0.3)


    No. of     Shear relaxation                     Mode number
    el.        C_s      φ²          1         2         3         4         5

    4          0        1         6.0209   15.5464   22.3395   27.3698   31.1135
               0.125    0.978     5.9858   15.4274   22.3390   27.1741   30.9162

    8          0        1         5.9753   14.9272   22.0599   25.2442   30.1047
               0.125    0.994     5.9666   14.8989   22.0597   25.1942   30.0633

    16         0        1         5.9638   14.7707   21.9879   24.6198   29.8900
               0.125    0.999     5.9617   14.7637   21.9878   24.6074   29.8799

    64         0        1         5.9602   14.7214   21.9650   24.4178   29.8236
               0.125    1.000     5.9601   14.7210   21.9650   24.4170   29.8229

    BENCHMARK
    256        0.125    1         5.9600   14.7183   21.9636   24.4051   29.8194


TABLE 5. Nondimensional symmetric natural frequencies ω_j[ρR²(1-ν²)/E]^(1/2) for
clamped hemispherical thin shell (R/h = 100, ν = 0.3)

TABLE 6. Nondimensional symmetric natural frequencies ω_j[ρR²(1-ν²)/E]^(1/2)

TABLE 7. Nondimensional natural frequencies Ω = ωR(ρ/E)^(1/2) for 60 deg.
clamped spherical shell (R/h = 20, ν = 0.3).
A BLOCK QR FACTORIZATION SCHEME FOR LOOSELY
COUPLED SYSTEMS OF ARRAY PROCESSORS


Charles  Van  Loan 
Department  of  Computer  Science 
Cornell  University 
Ithaca,  New  York  14853 


Abstract 

A statically scheduled parallel block QR factorization procedure is
described. It is based on "block" Givens rotations and is modeled after the
Gentleman-Kung systolic QR procedure. Independent tasks are associated
with each block column. "Tallest possible" subproblems are always
solved. The method has been implemented on the IBM Kingston LCAP-1
system which consists of ten FPS-164/MAX array processors that can
communicate through a large shared bulk memory. The implementation
revealed much about the tradeoff between block size and load balancing.
Large blocks make load balancing more difficult but give better 164/MAX
performance and less shared memory traffic. The results obtained
indicate that our approach to parallelizing the QR factorization is
competitive for very large problems, e.g., of the order 5000-by-1000.
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Kingston  and  on  the  Production  Supercomputer  Facility  at  Cornell  which  is  supported  in  part 
1. Introduction


Computing the QR factorization of a matrix $A \in \mathbb{R}^{m \times n}$
involves finding an orthogonal matrix $Q \in \mathbb{R}^{m \times m}$ and an
upper triangular matrix $R \in \mathbb{R}^{m \times n}$ such that $A = QR$.
This factorization has a prominent role to play in numerical linear
algebra, especially because of its bearing on the least squares problem.
A detailed description of the QR factorization and the various ways that
it can be computed may be found in Golub and Van Loan (1983).

Parallel methods for computing the QR factorization have received
considerable attention recently. For systolic arrays attention has
focused on methods that rely on Givens rotations. See Gentleman and
Kung (1981) or Heller and Ipsen (1983). Dongarra, Sameh, and Sorensen
(1986) have implemented both parallel Givens and parallel Householder
procedures on the Denelcor HEP.

In this paper we discuss a block version of the Gentleman-Kung method
that we have implemented on the IBM Kingston LCAP-1. This system
consists of ten FPS-164 array processors (APs) that can communicate
through several shared bulk memories. An overview of LCAP-1 is offered
in Clementi and Logan (1985). The features of LCAP-1 that figure in the
current work are depicted in the following diagram:


There are actually two levels of parallelism here because the APs are
each capable of performing twenty parallel dot products. Indeed, the
FPS-164/MAXs at Kingston each come equipped with two "MAX boards".
The MAX board enhancement enables each AP to perform matrix-matrix
multiplication at a peak rate of 55 Mflops if the matrices involved are
sufficiently large. Full exploitation of the FPS-164/MAX requires having
an algorithm that is rich in matrix multiplication. This is why we have
chosen to develop a parallel block procedure. The blocking of the matrix
A is largely a function of the 164/MAX architecture. For example, it turns
out to be efficient to have block columns whose widths are a multiple of
twenty simply because the LCAP-1 APs can each perform twenty parallel dot
products. Further details concerning the FPS-164/MAX architecture may
be found in Charlesworth and Gustafson (1986).

The matrix A is stored in a 64 Mword bulk memory unit manufactured
by Scientific Computing Associates (SCA). Thus, a dense problem of size
16K-by-4K could potentially be solved. The APs have approximately 600
Kwords of usable memory. This is enough to house, for example, a
1000-by-500 submatrix.

Data between the APs and the bulk memory flows at a rate of 44
Mbytes/sec. However, the high latency associated with each transferred
message demands that data be moved in fairly good-sized chunks, e.g.,
1000 words, in order to be efficient.

Additional nuances of the LCAP-1 system as they apply to our QR
implementation are detailed later.

This paper is the first of several reports in which we explore the
issues associated with parallel matrix computations on the LCAP-1. The
parallel block QR factorization scheme that we encoded is derived in §2
and §3. Implementation details are covered in §4 and results in §5. Our
current QR code can be improved in several ways as we often opted for
the "easy way out" when confronted with an algorithmic dilemma.
Despite this we feel that our LCAP-1 experience offers general
perspectives on large scale distributed matrix computations.
2. Parallel Givens QR


We say that $G \in \mathbb{R}^{m \times m}$ is an adjacent Givens rotation
in planes $i-1$ and $i$ if G is the identity with the following 2-by-2
exception:

$$\begin{bmatrix} g_{i-1,i-1} & g_{i-1,i} \\ g_{i,i-1} & g_{i,i} \end{bmatrix} = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}$$

Notice that G is orthogonal and that premultiplication by G affects just
rows $i-1$ and $i$. If $x \in \mathbb{R}^m$ then it is not hard to determine
$(\cos(\theta), \sin(\theta))$ so that $y_i = 0$ if $y = Gx$. These and
other Givens rotation issues are discussed in Golub and Van Loan (1983,
pp. 43-47).

Adjacent rotations are important because they only combine adjacent
rows or columns when applied to a matrix. Moreover, they can be used to
compute the QR factorization of a matrix. Assuming
$A \in \mathbb{R}^{m \times n}$ ($m \ge n$) we have:

Algorithm 2.1

    For j = 1:n
        For i = m:-1:j+1
            Determine an adjacent Givens rotation G_ij such that
            if y = G_ij^T A(:,j:j) then y_i = 0, i.e., zero a_ij.
            A := G_ij^T A
        end i
    end j

Upon completion A is overwritten by R and

$$Q = (G_{m,1} \cdots G_{2,1}) \cdots (G_{m,n} \cdots G_{n+1,n}).$$
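As an illustration of Algorithm 2.1, the following is a minimal sketch in plain Python, not the authors' code. The function name `givens_qr` and the list-of-rows matrix layout are assumptions made here for illustration; the rotation is chosen so that the lower of the two combined entries becomes zero.

```python
import math

def givens_qr(A):
    """Reduce A (m rows, n columns, m >= n) to upper triangular form using
    adjacent Givens rotations, zeroing each column from the bottom up."""
    R = [row[:] for row in A]              # work on a copy of A
    m, n = len(R), len(R[0])
    order = []                             # 1-based (i, j) in the order zeroed
    for j in range(n):                     # columns, left to right
        for i in range(m - 1, j, -1):      # rows, from the bottom up
            a, b = R[i - 1][j], R[i][j]
            r = math.hypot(a, b)
            if r != 0.0:
                c, s = a / r, b / r        # rotation that maps (a, b) to (r, 0)
                for col in range(n):       # only rows i-1 and i are combined
                    top, bot = R[i - 1][col], R[i][col]
                    R[i - 1][col] = c * top + s * bot
                    R[i][col] = -s * top + c * bot
            order.append((i + 1, j + 1))
    return R, order

R, order = givens_qr([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0],
                      [7.0, 8.0, 10.0],
                      [2.0, 1.0, 1.0]])
print(order)
```

For a 4-by-3 input the recorded order is column-by-column and bottom-up within each column, which is the order the algorithm prescribes.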


Notice that the algorithm computes R column-by-column and that the
zeroing within a column proceeds from the bottom up to the subdiagonal.
Here is a depiction of the 4-by-3 case:

    x x x     x x x     x x x     x x x     x x x     x x x     x x x
    x x x     x x x     x x x     0 x x     0 x x     0 x x     0 x x
    x x x  →  x x x  →  0 x x  →  0 x x  →  0 x x  →  0 0 x  →  0 0 x
    x x x     0 x x     0 x x     0 x x     0 0 x     0 0 x     0 0 0

To indicate the inherent parallelism in this procedure we resort to a
slightly larger example ($m = 9$, $n = 4$) and number the $a_{ij}$ in the
order that they are zeroed:

    x    x    x    x
    8    x    x    x
    7   15    x    x
    6   14   21    x
    5   13   20   26
    4   12   19   25
    3   11   18   24
    2   10   17   23
    1    9   16   22

Recognize that the computation and application of $G_{ij}$ can begin as soon
as $G_{i-1,j-1}$ is applied to A. To illustrate this we tabulate the earliest
"time step" that $a_{ij}$ ($i > j$) can be zeroed ($m = 9$, $n = 4$):

    x    x    x    x
    8    x    x    x
    7    9    x    x
    6    8   10    x
    5    7    9   11
    4    6    8   10
    3    5    7    9
    2    4    6    8
    1    3    5    7

With this notation we see in the example that four Givens updates can be
performed during the seventh time step: $G_{31}$, $G_{52}$, $G_{73}$, and
$G_{94}$. If we had 4 processors then they could each be assigned one of
these tasks.

The parallelism that we have exposed in the above example can be
formalized by rearranging the loop indexing in Algorithm 2.1 and noting
that $m+n-2$ time steps are required.


Algorithm 2.2

    For k = 1:m+n-2
        For All j = 1:n
            i = m-k+1+2(j-1)
            if ( i <= m & i >= j+1 )
                Determine G_ij to zero a_ij
                A := G_ij^T A
            end
        end j
    end k

The "For All" statement reminds us that all of the updates
$A := G_{ij}^T A$ associated with a given time step k are independent and
can be performed in parallel.
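The index formula in Algorithm 2.2 can be checked directly. The sketch below (the helper name `givens_schedule` is an assumption for illustration) lists, for each time step k, the entries (i, j) that the formula i = m-k+1+2(j-1) selects.

```python
def givens_schedule(m, n):
    """Map each time step k to the 1-based entries (i, j) zeroed at that step."""
    steps = {}
    for k in range(1, m + n - 1):              # k = 1 : m+n-2
        for j in range(1, n + 1):
            i = m - k + 1 + 2 * (j - 1)
            if j + 1 <= i <= m:                # only subdiagonal entries qualify
                steps.setdefault(k, []).append((i, j))
    return steps

schedule = givens_schedule(9, 4)
print(schedule[7])   # [(3, 1), (5, 2), (7, 3), (9, 4)]
```

Each rotation at step k touches rows i-1 and i only, and the selected values of i differ by two from one column to the next, so the row pairs within a step never overlap.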


We point out that $G_{ij}$ can actually be computed "earlier" than we have
indicated. For example, in the $(m,n) = (9,4)$ case above, we have assumed
that $G_{92}$ is computed as soon as $G_{81}$ has been applied all the way
across the matrix. In fact, $G_{92}$ can be computed as soon as $G_{81}$ has
been applied to just the second column. For reasons that we give in §4, we
have not implemented the "soon as possible" generation of $G_{ij}$.

Algorithm 2.2 and its natural variants can be mapped nicely onto
systolic networks. See Heller and Ipsen (1983).
3. A Parallel Block QR Factorization Method


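Some notation is required before a block version of Algorithm 2.2 can
be specified. Partition $A \in \mathbb{R}^{m \times n}$ as follows:

$$A = \begin{bmatrix} A_{11} & \cdots & A_{1q} \\ \vdots & & \vdots \\ A_{p1} & \cdots & A_{pq} \end{bmatrix} \begin{matrix} m_1 \\ \vdots \\ m_p \end{matrix} \qquad (3.1)$$

where the block columns have widths $n_1, \ldots, n_q$. Here, $A_{ij}$ is
$m_i$-by-$n_j$ and we assume that $m_i \ge n_j$ for all i and j. If Q is
an orthogonal matrix of dimension $m_{i-1} + m_i$ then we refer to

$$G_i(Q) = \mathrm{diag}(I_{m_1}, \ldots, I_{m_{i-2}}, Q, I_{m_{i+1}}, \ldots, I_{m_p})$$

as an adjacent "block Givens" rotation in block planes $i-1$ and $i$.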


Algorithm 3.1 (Block Givens QR Factorization)

    For k = 1:p+q-2
        For All j = 1:q
            i = p-k+1+2(j-1)
            if ( i <= p & i >= j+1 )
                Determine orthogonal Q_ij such that

                    Q_ij^T [ A_{i-1,j} ]  =  [ R ]
                           [ A_{i,j}   ]     [ 0 ]     (R upper triangular)

                Set G_ij = G_i(Q_ij) and update A := G_ij^T A
            end
        end j
    end k
This procedure is identical to Algorithm 2.2 except that blocks are zeroed
instead of scalars. Upon completion A is overwritten with a block upper
triangular matrix R. Unless all the $A_{ij}$ are square, R will not be upper
triangular as a scalar matrix. For example, if the partitioning in (3.1) is
defined by $(m_1, m_2) = (3,3)$ and $(n_1, n_2) = (2,2)$ then Algorithm 3.1
overwrites A with

    x  x  x  x
    0  x  x  x
    0  0  x  x
    0  0  x  x
    0  0  0  x
    0  0  0  0

Of course, it is possible to upper triangularize this matrix with further
Givens operations, but that is an annoying, if necessary, follow-up
computation.

However, there is a more serious problem associated with rectangular
blocks. Consider the example $(m_1, m_2, m_3, m_4) = (2,3,3,8)$,
$(n_1, n_2) = (2,2)$. At the beginning of the second time step A looks like

    x  x | x  x
    x  x | x  x
    -----------
    x  x | x  x
    x  x | x  x
    x  x | x  x
    -----------
    x  x | x  x
    0  x | x  x
    0  0 | x  x
    -----------
    0  0 | x  x
    0  0 | x  x
    0  0 | x  x
    0  0 | x  x
    0  0 | x  x
    0  0 | x  x
    0  0 | x  x
    0  0 | x  x

At this stage, Algorithm 3.1 specifies that we only upper triangularize
the submatrix A(3:8,1:2), i.e., the subproblem defined by blocks $A_{21}$ and
$A_{31}$. However, we see from the figure that a significant amount of zeroing
in the second block column can take place concurrently. In particular, we
could upper triangularize both A(3:8,1:2) and A(9:16,3:4).

In general, because the "bottom" submatrix $A_{ij}$ in each subproblem is
upper triangular, "taller" submatrices can be upper triangularized
throughout Algorithm 3.1. In order to rearrange this algorithm so that
"maximally tall" subproblems are solved at each stage, we need to drop
the fixed row blocking in (3.1). We continue to assume that A has q block
columns with widths $n_1, \ldots, n_q$. However, instead of imposing a fixed
blocking of A's rows we have chosen to determine the "height" of the
subproblems through an integer parameter $m_0$ that satisfies $m_0 > n_1$. In
our scheme, the subproblems in the first block column involve at most $m_0$
rows. Maximally tall subproblems are then solved in subsequent block
columns at each step. To illustrate, consider the case $m = 100$, $m_0 = 20$,
and $(n_1, n_2, n_3, n_4) = (2,3,5,5)$:


                     Subproblem Row Ranges

                        Column Ranges
    Time Step     1:2       3:5       6:10      11:15

        1        81:100      -         -          -
        2        63:82     83:100      -          -
        3        45:64     65:85     86:100       -
        4        27:46     47:67     68:90      91:100
        5         9:28     29:49     50:72      73:95
        6         1:10     11:31     32:54      55:77
        7          -        3:13     14:36      37:59
        8          -         -        6:18      19:41
        9          -         -         -        11:23

In general four integers rowsrt(t,j), rowend(t,j), colsrt(j), and colend(j)
are necessary to describe subproblem (t,j), e.g., 29, 49, 3, and 5 for
subproblem (5,2). These index arrays and the total number of time steps
$t_f$ required can be computed as follows:


Algorithm 3.2

Let m, n, m_0, q and the column partitioning (n_1,...,n_q) be given with
m >= n and m_0 > n_1. This algorithm determines t_f and the index arrays
colsrt(1:q), colend(1:q), rowsrt(1:t_f,1:q), and rowend(1:t_f,1:q).

    t_f = ceiling( max(0,m-m_0) / (m_0-n_1) ) + q
    For t = 1:t_f
        if t = 1
            For j = 1:q
                if j = 1
                    colsrt(1) = 1
                    colend(1) = n_1
                    rowsrt(t,j) = max( 1, m-m_0+1 )
                else
                    colsrt(j) = colend(j-1) + 1
                    colend(j) = colend(j-1) + n_j
                    rowsrt(t,j) = m
                end
                rowend(t,j) = m
            end j
        else
            For j = 1:q
                if j = 1
                    rowend(t,1) = rowsrt(t-1,1) + n_1 - 1
                    rowsrt(t,1) = max( 1, rowend(t,1)-m_0+1 )
                else
                    rowend(t,j) = min( rowsrt(t-1,j) + n_j - 1, m )
                    rowsrt(t,j) = max( colsrt(j), min( rowend(t,j-1)+1, m ) )
                end
            end j
        end
    end t
A couple of comments are in order. In block column 1, the subproblems
"climb" at the "rate" $m_0 - n_1$ and so
$1 + \mathrm{ceiling}(\max(0, m-m_0)/(m_0-n_1))$ steps are required to
complete the processing of block column 1. Thereafter one block column per
time step is completed. This explains the formula for $t_f$ and why we must
have $m_0 > n_1$.

In block column j, "serious" computation does not begin so long as
rowsrt(t,j) = rowend(t,j) = m. After block column j is fully triangularized,
rowsrt(t,j) = colsrt(j) and rowend(t,j) = colend(j), conditions that
normally signal that there is "nothing to do" in block column j. (An
exception occurs when rowsrt(t,j) = colsrt(j) and rowend(t,j) = colend(j)
= m.)
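The index computation of Algorithm 3.2 can be sketched in Python as follows. This is an illustrative transcription, not the authors' code: the names `block_index_arrays` and `has_work` are assumptions, and the arrays are kept 1-based (index 0 unused) so that they line up with the pseudocode.

```python
from math import ceil

def block_index_arrays(m, m0, widths):
    """Return tf, colsrt, colend, rowsrt, rowend, all 1-based (index 0 unused)."""
    q, n1 = len(widths), widths[0]
    tf = ceil(max(0, m - m0) / (m0 - n1)) + q
    colsrt = [0] * (q + 1)
    colend = [0] * (q + 1)
    rowsrt = [[0] * (q + 1) for _ in range(tf + 1)]
    rowend = [[0] * (q + 1) for _ in range(tf + 1)]
    for j in range(1, q + 1):                       # time step t = 1
        colsrt[j] = colend[j - 1] + 1
        colend[j] = colend[j - 1] + widths[j - 1]
        rowsrt[1][j] = max(1, m - m0 + 1) if j == 1 else m
        rowend[1][j] = m
    for t in range(2, tf + 1):                      # time steps t = 2 : tf
        rowend[t][1] = rowsrt[t - 1][1] + n1 - 1
        rowsrt[t][1] = max(1, rowend[t][1] - m0 + 1)
        for j in range(2, q + 1):
            rowend[t][j] = min(rowsrt[t - 1][j] + widths[j - 1] - 1, m)
            rowsrt[t][j] = max(colsrt[j], min(rowend[t][j - 1] + 1, m))
    return tf, colsrt, colend, rowsrt, rowend

def has_work(t, j, m, colsrt, colend, rowsrt, rowend):
    """False when subproblem (t, j) is one of the 'nothing to do' cases."""
    i1, i2 = rowsrt[t][j], rowend[t][j]
    if i1 == i2 == m:
        return False                                # column j has not started yet
    return not (i1 == colsrt[j] and i2 == colend[j] and i2 != m)

tf, colsrt, colend, rowsrt, rowend = block_index_arrays(100, 20, [2, 3, 5, 5])
print(tf, rowsrt[5][2], rowend[5][2], colsrt[2], colend[2])   # 9 29 49 3 5
```

With m = 100, m_0 = 20, and widths (2,3,5,5) this reproduces the row ranges tabulated above, including the delimiters 29, 49, 3, and 5 quoted for subproblem (5,2).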

With  subproblems  specified  by  Algorithm  3.2  we  can  now  describe  the 
overall  factorization  procedure. 

Algorithm 3.3 (Maximally Tall Block Givens QR Factorization)

Given m, n, m_0, q, the column partitioning (n_1,...,n_q) with m >= n and
m_0 > n_1, the following algorithm overwrites A (m-by-n) with upper
triangular R = Q^T A where Q is orthogonal.

    Compute t_f, rowsrt(1:t_f,1:q), rowend(1:t_f,1:q),
        colsrt(1:q), and colend(1:q) using Algorithm 3.2
    For t = 1:t_f
        For j = 1:q
            i_1 = rowsrt(t,j)
            i_2 = rowend(t,j)
            j_1 = colsrt(j)
            j_2 = colend(j)
            if ( i_1 = i_2 = m or ( i_1 = j_1 & i_2 = j_2 & i_2 ≠ m ) )
                "Nothing to do."
            else
                Compute: A(i_1:i_2,j_1:j_2) = QR.
                Apply:   A(i_1:i_2,j_1:n) := Q^T A(i_1:i_2,j_1:n)
            end
        end j
    end t
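A serial sketch of Algorithm 3.3 is given below, assuming NumPy is available. It is not the LCAP-1 implementation: `block_givens_qr` is a hypothetical name, each subproblem is factored with `numpy.linalg.qr` rather than with the WY-based Householder code described in §4, and the tasks of one time step are simply run one after another.

```python
from math import ceil
import numpy as np

def block_givens_qr(A, m0, widths):
    """Return R = Q^T A from the maximally tall block scheme (A is left intact)."""
    R = np.array(A, dtype=float)
    m, n = R.shape
    q, n1 = len(widths), widths[0]
    assert sum(widths) == n and m >= n and m0 > n1
    colend = np.cumsum(widths)                    # 1-based last column per block
    colsrt = colend - np.asarray(widths) + 1      # 1-based first column per block
    tf = ceil(max(0, m - m0) / (m0 - n1)) + q
    rowsrt = [max(1, m - m0 + 1)] + [m] * (q - 1) # row ranges at t = 1
    rowend = [m] * q
    for t in range(1, tf + 1):
        if t > 1:                                 # advance the row ranges
            prev_srt = rowsrt[:]
            rowend[0] = prev_srt[0] + n1 - 1
            rowsrt[0] = max(1, rowend[0] - m0 + 1)
            for j in range(1, q):
                rowend[j] = min(prev_srt[j] + widths[j] - 1, m)
                rowsrt[j] = max(colsrt[j], min(rowend[j - 1] + 1, m))
        for j in range(q):                        # independent tasks of step t
            i1, i2, j1, j2 = rowsrt[j], rowend[j], colsrt[j], colend[j]
            if i1 == i2 == m or (i1 == j1 and i2 == j2 and i2 != m):
                continue                          # "Nothing to do."
            rows = slice(i1 - 1, i2)
            Q, _ = np.linalg.qr(R[rows, j1 - 1:j2], mode="complete")
            R[rows, j1 - 1:] = Q.T @ R[rows, j1 - 1:]
    return R

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 15))
R = block_givens_qr(A, 20, [2, 3, 5, 5])
print(np.abs(np.tril(R, -1)).max())               # size of the subdiagonal part
```

Because only orthogonal transformations are applied, R^T R should agree with A^T A up to rounding, which gives a quick consistency check alongside the triangularity of R.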
4. Implementation


In this section we discuss three issues associated with the
implementation of Algorithm 3.3 on the LCAP-1 system: how A is
arranged in shared memory, how the subproblems are solved, and how
block column tasks are mapped onto processors.

The  Storage  of  A 

At time step t, the relevant row and column delimiters for the j-th
subproblem are $i_1$ = rowsrt(t,j), $i_2$ = rowend(t,j), $j_1$ = colsrt(j),
and $j_2$ = colend(j). Here is what the array processor in charge of this
subproblem must accomplish:

1. Read $A(i_1:i_2, j_1:j_2)$ from shared memory.

2. Compute an orthogonal Q such that $Q^T A(i_1:i_2, j_1:j_2) = R$ is
upper triangular.

3. Write the updated $A(i_1:i_2, j_1:j_2)$ back into shared memory.

4. Read $A(i_1:i_2, j_2+1:n)$ from shared memory.

5. Apply $Q^T$ to $A(i_1:i_2, j_2+1:n)$.

6. Write the updated $A(i_1:i_2, j_2+1:n)$ back into shared memory.

We assume that $A(i_1:i_2, j_1:j_2)$ can fit into local memory but that,
because of its size, the processing of $A(i_1:i_2, j_2+1:n)$ may have to
proceed in "chunks". That is, steps 4-5-6 may have to be repeated with a
manageable segment of columns from $A(i_1:i_2, j_2+1:n)$ each time. Note
that Q stays in the AP during this process. Because one AP is responsible
for applying a given Q, there is no need to pass Q on to another AP.

There is an overhead associated with traffic to and from shared
memory. Reads and writes to shared memory are accomplished with a
"move" command and can only involve contiguous portions of memory.
Using move to transfer n floating point words takes

$$T(n) = (100 + 8n/44)\ \mu\text{sec}.$$

Note that the 100 μsec startup degrades the 44 mb/sec peak transfer rate.
Thus, a vector of length 1000 takes 281 μsec to move for an effective
data transfer rate of 28 mb/sec.
From the standpoint of processing the subproblem at hand, it would be
ideal if $A(i_1:i_2, j_1:n)$ was contiguous in shared memory, for then a
minimum number of moves would be required to carry out steps 1, 3, 4, and 6
above. For example, to read a contiguous 1000-by-500 submatrix from shared
memory would require T(500,000) = .09 sec (≈ 44 mb/sec). Unfortunately,
storing by blocks in Algorithm 3.3 would impose significant buffer
requirements and some tedious data manipulation within each AP. The
buffer issue is fairly important because the APs we used have limited
local memory (≈ 600 Kwords).

Because we didn't want additional buffer requirements to limit further
the size of "working" memory we chose to store A in column major order.
This implies that r moves are required to move a submatrix with r
columns. Thus, to read a 1000-by-500 submatrix requires 500*T(1000) =
.14 sec (≈ 28 mb/sec). This is actually a typical size for a submatrix move
in our algorithm. When the overall implementation is considered, we can
easily live with a 28 mb/sec data transfer rate.
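The transfer figures quoted in this subsection can be reproduced with a few lines of Python. The sketch assumes that a floating point word is eight bytes, so that the per-word cost is 8/44 μsec at the 44 Mbytes/sec peak rate; this assumption is what yields the quoted 281 μsec for a 1000-word move. The function names are illustrative only.

```python
STARTUP_USEC = 100.0      # startup cost of one "move"
BYTES_PER_WORD = 8        # assumption: 64-bit floating point words
PEAK_MB_PER_SEC = 44.0    # peak transfer rate (bytes per microsecond)

def move_time_usec(n_words):
    """Time in microseconds to move n contiguous words."""
    return STARTUP_USEC + BYTES_PER_WORD * n_words / PEAK_MB_PER_SEC

def submatrix_read(rows, cols, contiguous):
    """Return (seconds, effective Mbytes/sec) for reading a rows-by-cols block.

    A contiguous block needs one move; column major storage needs one move
    per column.
    """
    if contiguous:
        usec = move_time_usec(rows * cols)
    else:
        usec = cols * move_time_usec(rows)
    mbytes = BYTES_PER_WORD * rows * cols / 1e6
    seconds = usec / 1e6
    return seconds, mbytes / seconds

print(move_time_usec(1000))
print(submatrix_read(1000, 500, contiguous=True))
print(submatrix_read(1000, 500, contiguous=False))
```

The last two lines compare the single-move and the 500-move reads of a 1000-by-500 submatrix, i.e., the two storage choices weighed in the text.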

Subproblem  Solution 


The basic computation in Algorithm 3.3 consists of computing a QR
factorization and then applying the resulting orthogonal matrix to the
"rest of A". The normal "Linpack" way to compute a QR factorization of
a matrix $C \in \mathbb{R}^{m_0 \times n_0}$ is to use Householder matrices.
A Householder matrix is an orthogonal transformation of the form

$$P = I - 2vv^T, \qquad v \in \mathbb{R}^{m_0}, \quad \|v\|_2 = 1.$$

In the Linpack QR procedure Householders $P_1, \ldots, P_{n_0}$ are generated
so that $P_{n_0} \cdots P_1 C = R$ is upper triangular. Note that
$Q = P_1 \cdots P_{n_0}$.

We now consider the computation $Q^T B$ where B is some matrix. If Q
is represented as a product of Householders, then the resulting algorithm
is "rich" in matrix-vector multiplications. This is fine for many
architectures. However, to exploit fully the FPS-164/MAX architecture,
we need an update algorithm that is rich in matrix-matrix multiplication.
We could accomplish this by explicitly forming the product
$Q = P_1 \cdots P_{n_0}$ before applying it to B. But this would be very
costly since $m_0 \gg n_0$ usually. An unacceptably large $m_0$-by-$m_0$
buffer would also be required by this approach.

Instead, we have chosen to use the "WY" representation for products
of Householder matrices that is developed in Bischof and Van Loan (1985).
In this scheme $m_0$-by-$n_0$ matrices W and Y are generated such that

$$Q = P_1 \cdots P_{n_0} = I + WY^T.$$

The ensuing update $B := Q^T B = (I + WY^T)^T B = B + Y(W^T B)$ is then
obtained by a pair of matrix-matrix multiplications:

(i) $Z = W^T B$

(ii) $B = B + YZ$
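To make the WY form concrete, here is a small NumPy sketch. It is an illustration under stated assumptions rather than the Bischof and Van Loan code: the function name `householder_wy` is hypothetical, and W and Y are accumulated with the standard recurrence obtained by writing each Householder matrix as $I + wv^T$ with $w = -2v$.

```python
import numpy as np

def householder_wy(C):
    """Return W, Y, R with Q = I + W @ Y.T orthogonal and Q.T @ C = R."""
    R = np.array(C, dtype=float)
    m0, n0 = R.shape
    W = np.zeros((m0, n0))
    Y = np.zeros((m0, n0))
    for k in range(n0):
        x = R[k:, k]
        v = np.zeros(m0)
        v[k:] = x
        v[k] += (1.0 if x[0] >= 0 else -1.0) * np.linalg.norm(x)
        norm_v = np.linalg.norm(v)
        if norm_v == 0.0:
            continue                          # column already zero: P_k = I
        v /= norm_v
        R -= 2.0 * np.outer(v, v @ R)         # apply P_k = I - 2 v v^T
        w = -2.0 * v                          # so that P_k = I + w v^T
        # Q_k = Q_{k-1} P_k appends the column Q_{k-1} w to W and v to Y.
        W[:, k] = w + W[:, :k] @ (Y[:, :k].T @ w)
        Y[:, k] = v
    return W, Y, R

rng = np.random.default_rng(0)
C = rng.standard_normal((8, 3))
B = rng.standard_normal((8, 5))
W, Y, R = householder_wy(C)
Z = W.T @ B                                   # step (i)
B_new = B + Y @ Z                             # step (ii): B_new = Q^T B
print(np.abs(np.tril(R, -1)).max())
```

The update of B uses two matrix-matrix products and never forms the $m_0$-by-$m_0$ matrix Q, which is the point of the representation.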

For (i) we used the "MAX" routine pdot that can compute twenty
parallel dot products. To initiate the parallel dot product the relevant
twenty vectors must be placed in the MAX registers using another MAX
routine called ploadd. We examine this in some detail so that an
appreciation of MAX board computing can be obtained. Assume that W and
Y are $m_0$-by-$n_0$ and that $n_0$ (for simplicity) is a multiple of twenty.
If B is $m_0$-by-$k$ then here is how the matrix $Z = W^T B$ is formed:

    For j = 1:20:n_0
        Load W(1:m_0, j:j+19) into the MAX registers using ploadd.
        For i = 1:k
            Compute Z(j:j+19, i:i) = W(1:m_0, j:j+19)^T B(1:m_0, i:i)
            using pdot.
        end i
    end j

The times required for each ploadd and pdot are approximately

    ploadd:  L(m_0) = 23 + 58.2*m_0    (μsec)
    pdot:    D(m_0) = 29.7 + .738*m_0  (μsec)

Thus, $Z = W^T B$ is obtained in $(n_0/20)(L(m_0) + kD(m_0))$ μsec. Since
$Z = W^T B$ requires $2 m_0 n_0 k$ flops, a calculation shows that the
effective performance in megaflops is approximately given by

$$\mathrm{Mflop}(W^T B) = \frac{55}{1 + 40/m_0 + 79/k + 31/(m_0 k)}.$$

This expression reveals the penalty for short vectors (small $m_0$) and for
low re-use (small k). Here is a table of some representative
Mflop($W^T B$) values:


                    k = 100    k = 500    k = 1000    k = 5000

    m_0 = 100          25         35         37          39
    m_0 = 500          29         44         47          50
    m_0 = 1000         30         46         49          52
    m_0 = 2000         30         47         50          53

                            Table 4.1


We mention that because the MAX registers can handle vectors up to
length 2047, the subproblem height parameter $m_0$ should be chosen so
that rowend(t,j) - rowsrt(t,j) < 2047 for all t and j.

We now turn our attention to the rank-$n_0$ update $B \leftarrow B + YZ$ that
makes up the second half of the $B \leftarrow (I + WY^T)^T B$ computation. For
this calculation the FPS-164/MAX has a parallel saxpy capability that
appears well suited. With two MAX boards it is possible to perform nine
saxpys of the form $c_j \leftarrow c_j + s_j y$ in parallel. Note that this is
a rank-one update: $C \leftarrow C + ys^T$. Here is how the update of B would
proceed using the parallel saxpy routine pvsma and the attending
load/unload routines ploadv and punldv. For simplicity, assume that k is a
multiple of 9:


    For j = 1:9:k
        Use ploadv to load B(1:m_0, j:j+8) into the MAX registers.
        For i = 1:n_0
            Use pvsma to perform the update
            B(1:m_0, j:j+8) ← B(1:m_0, j:j+8) + Y(1:m_0, i:i) Z(i:i, j:j+8)
        end i
        Use punldv to write the updated B(1:m_0, j:j+8) back to memory.
    end j


Reasoning as we did to determine Mflop($W^T B$), it can be shown that

$$\mathrm{Mflop}(B + YZ) = \frac{24}{1 + 34/m_0 + 70/n_0 + 62/(m_0 n_0)}.$$

Note that the re-use factor is now $n_0$ rather than k. This is unfortunate
since in our application we typically have $k > m_0 \gg n_0$. If we look at
some typical values of Mflop($B + YZ$), then this is what we find:


                    n_0 = 20   n_0 = 40   n_0 = 60   n_0 = 80

    m_0 = 100          4.9        7.7        9.5       10.8
    m_0 = 500          5.2        8.5       10.7       12.3
    m_0 = 1000         5.3        8.6       10.9       12.6
    m_0 = 2000         5.3        8.6       11.0       12.7

                            Table 4.2


Thus, pvsma is ill-suited for the $B \leftarrow B + YZ$ update when compared to
the 25-53 Mflop rates sustained by the pdot computation of $Z = W^T B$. For
this reason we chose to use a new FPS parallel matrix multiply routine
called pmmul that can perform the update $B \leftarrow B + YZ$ at rates more
consistent with the values in Table 4.1.
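The two performance models above are easy to tabulate. The Python sketch below evaluates the closed-form expressions and, for the pdot case, also the rate obtained directly from the L(m_0) and D(m_0) timings; the function names are assumptions made for illustration.

```python
def pdot_mflops_from_timings(m0, k):
    """Rate for Z = W^T B computed from the ploadd/pdot timings."""
    load = 23.0 + 58.2 * m0              # L(m0): microseconds per 20 columns of W
    dot = 29.7 + 0.738 * m0              # D(m0): microseconds per 20 dot products
    flops = 2.0 * m0 * 20 * k            # flops per block of 20 columns
    return flops / (load + k * dot)      # flops per microsecond = Mflops

def pdot_mflops(m0, k):
    """Closed-form model for Z = W^T B."""
    return 55.0 / (1 + 40.0 / m0 + 79.0 / k + 31.0 / (m0 * k))

def saxpy_mflops(m0, n0):
    """Closed-form model for B <- B + YZ with the parallel saxpy routine."""
    return 24.0 / (1 + 34.0 / m0 + 70.0 / n0 + 62.0 / (m0 * n0))

for m0 in (100, 500, 1000, 2000):
    print(m0,
          [round(pdot_mflops(m0, k)) for k in (100, 500, 1000, 5000)],
          [round(saxpy_mflops(m0, n0), 1) for n0 in (20, 40, 60, 80)])
```

The timing-based rate comes out slightly below the closed form because 40/.738 is about 54.2 rather than the rounded peak of 55; the two agree to within a few percent.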
Two final comments about subproblem solution. The first concerns the
recording of the orthogonal matrix Q. This matrix is the product of
Householder matrices. Of course, these Householders are clustered and
applied in WY form during Algorithm 3.3. But we can save all the
Householder vectors by overwriting each zeroed subcolumn of A by the
corresponding Householder vector. In particular, whenever a subcolumn
$v \in \mathbb{R}^d$ of A is zeroed by a Householder matrix
$(I - 2uu^T/u^Tu)$, we store u(2:d) in v(2:d) with the convention u(1) = 1.
It is then possible to retrieve Q from the final array A so long as the
index arrays rowsrt, rowend, colsrt, and colend are available.

Lastly, we mention that the subproblem QR factorizations in
Algorithm 3.3 are typically of matrices that have a band structure. Indeed,
it is usually the case that A(rowsrt(t,j):rowend(t,j),colsrt(j):colend(j))
has lower bandwidth rowsrt(t-1,j)-rowsrt(t,j). This fact is exploited
when the QR factorization is computed and the resulting WY factors
found.


Load  Balancing  and  Scheduling 


Suppose Algorithm 3.3 is to be implemented on array processors
$AP_1, \ldots, AP_p$. At time step t in Algorithm 3.3 there are q independent
tasks to perform. Task (t,j) involves

    Factoring: A(rowsrt(t,j):rowend(t,j),colsrt(j):colend(j)) = QR

    Computing: Q^T A(rowsrt(t,j):rowend(t,j),colsrt(j):n)

Here, t and j satisfy $1 \le t \le t_f$ and $1 \le j \le q$. If $p = q$ then
an immediate load balancing problem arises if each block column has the same
width because task (t,j) generally has more matrix to update than task
(t,j+1). One way around this difficulty is to make each block column wider
than its predecessor. We illustrate this for the case $q = 2$ with block
column widths $n_1$ and $n_2$. Assuming a subproblem height of $m_0$, then
approximately $2m_0n_1^2 + 2n_1m_0n_2$ flops are required for task (t,1). On
the other hand, $2m_0n_2^2$ flops are required for task (t,2) if we again
assume a subproblem height of $m_0$. These two flop counts are approximately
equal if $(n_1/n_2) \approx .62$.
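The value .62 follows from equating the two flop counts; the short derivation below introduces the ratio $r = n_1/n_2$ for convenience.

```latex
\begin{align*}
2 m_0 n_1^2 + 2 n_1 m_0 n_2 &= 2 m_0 n_2^2 \\
r^2 + r - 1 &= 0, \qquad r = n_1 / n_2 \quad \text{(after dividing by } 2 m_0 n_2^2\text{)} \\
r &= \frac{\sqrt{5} - 1}{2} \approx 0.618
\end{align*}
```

Only the positive root is meaningful, since both widths are positive.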



For general q it is possible to work out quotients $n_j/n_{j+1}$ for
$j = 1:q-1$ so that approximate load balancing results for the column
partitioning $n_1, \ldots, n_q$. Of course, in practice it would make more
sense to base column partitioning guidelines upon benchmarks rather than
upon flop counts. We have not pursued this.

Instead we make the block column widths narrow enough so that the
number of independent tasks q is significantly larger than the number p of
assigned APs. Approximate load balancing is then achieved by assigning
$AP_k$ to block columns $j = k:p:q$. For example, if $p = 3$ and $q = 12$,
then $AP_1$ works on block columns 1, 4, 7, and 10, $AP_2$ is assigned to
block columns 2, 5, 8, and 11, while $AP_3$ is applied to block columns 3, 6,
9, and 12. In a typical time step, each AP will work on 4 subproblems with a
greater balance of work than if $q = 3$. This style of distributing tasks
has been widely used in parallel matrix factorization work; see George,
Heath, and Liu (1985). A fringe benefit of this approach is that we can
choose block column widths to be a multiple of twenty. This allows for
efficient exploitation of the 164/MAX architecture that permits twenty
parallel dot products. In our examples we used uniform block column widths
of twenty and thus $q = n/20$.

To actually execute Algorithm 3.3 in parallel on LCAP-1 we
implemented a lock-step synchronization scheme using "barriers". The
blocking arrays rowsrt, rowend, colsrt, and colend are determined by the
host and then downloaded into the p array processors assigned to the
computation. The matrix A is also downloaded into the shared memory
through the APs. $AP_k$ then executes the following program:

Algorithm 4.1 (Processor k's Share of Algorithm 3.3)

For t = 1:t_f
    For j = k:p:q
        Compute A(rowsrt(t,j):rowend(t,j),colsrt(j):colend(j)) = QR
        Update A(rowsrt(t,j):rowend(t,j),colsrt(j):n)
    end j
    Barrier
end t


When the barrier is encountered, execution is suspended until all the other
AP programs reach their barrier. After this is accomplished the
processing of the next time step begins.
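
The lock-step structure can be illustrated with threads standing in for the APs. This is a sketch under stated assumptions, not the LCAP-1 implementation: `run_lockstep` is a hypothetical name, the factor-and-update work is replaced by a log entry, and Python's `threading.Barrier` plays the role of the barrier.

```python
import threading

def run_lockstep(p, q, t_f):
    barrier = threading.Barrier(p)   # all p workers must arrive before any proceeds
    log = []                         # (t, k, j) records, in order of execution
    lock = threading.Lock()

    def ap_program(k):
        for t in range(1, t_f + 1):
            for j in range(k, q + 1, p):      # block columns j = k:p:q
                with lock:
                    log.append((t, k, j))     # stand-in for "Compute QR" + "Update"
            barrier.wait()                    # suspend until every AP finishes step t

    threads = [threading.Thread(target=ap_program, args=(k,)) for k in range(1, p + 1)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return log

if __name__ == "__main__":
    print(len(run_lockstep(3, 12, 4)))
```

Because of the barrier, every task of time step t is recorded before any task of step t+1, which is the property the synchronization scheme relies on.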

Further details about the LCAP-1 system software required by our
implementation may be found in Chin and Lorenzo (1986).

5.  Some  Results  and  Conclusions 

In testing our implementation we ran our codes on random matrices
$A \in \mathbb{R}^{m \times n}$ with the property that A(1:m,1:n-1)e = A(1:m,n:n) where e is the
vector of all ones. The correctness of R was then confirmed by checking
the equations R(1:n-1,1:n-1)e = R(1:n-1,n) and R(n,n) = 0.

We report on two of the several examples that we solved using the
parallel QR code. We do not pretend that our results are conclusive. They
merely confirm some natural suspicions and point the way to future
research.

The first example indicates that we can get away with our lock-step,
coarse-grained approach if A is large enough and suitably blocked. Here
is what we found by using one, two, and three APs to solve an (m,n) =
(5000,1000) problem with $m_0$ = 1000, q = 50, and $n_1 = \cdots = n_{50} = 20$.


Number of Processors    Time (seconds)    Speed-Up    Effective Mflop

Table 5.1


About 25% of the elapsed time is spent on transmitting submatrices to
and from the shared memory. To see roughly where this percent comes
from consider the update $B \leftarrow (I + WY^T)^T B$ of a 1000-by-500 submatrix B
in shared memory where $W, Y \in \mathbb{R}^{1000 \times 20}$. If this update is performed at a
rate of 30 Mflops then approximately 1.3 seconds must be devoted to
computation. To transfer B to or from shared memory requires about .14
seconds. Thus, the fraction of time spent on communication is
approximately .18 $\approx$ .28/1.58.
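
The arithmetic behind this estimate can be sketched as follows. The function name and the flop model are assumptions made for illustration: the update is counted as two matrix products of about 2·1000·20·500 flops each, and one transfer in plus one transfer out is charged at .14 seconds apiece.

```python
def communication_fraction(m=1000, n=500, k=20, mflops=30.0, transfer_s=0.14):
    # B <- B + Y (W^T B): W^T B and Y (W^T B) each cost about 2*m*k*n flops.
    flops = 4 * m * k * n
    compute_s = flops / (mflops * 1e6)
    comm_s = 2 * transfer_s              # B read from and written back to shared memory
    return compute_s, comm_s / (compute_s + comm_s)

if __name__ == "__main__":
    compute_s, fraction = communication_fraction()
    print(round(compute_s, 2), round(fraction, 2))
```

Using the unrounded compute time gives a fraction slightly below the .18 quoted, which uses the rounded value of 1.3 seconds.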

We next discuss an example where the load balancing isn't quite so
nice, resulting in a degradation of performance. In the example m = 5040,
$m_0$ = 1040, n = 500, q = 13, and $n_1 = \cdots = n_{12} = 40$, $n_{13} = 20$. Three APs were
used and thus block column tasks are assigned as follows:

AP$_1$ ← (1,4,7,10,13)    AP$_2$ ← (2,5,8,11)    AP$_3$ ← (3,6,9,12)

Because only five steps are required to process each block column, there
are never more than five "active" tasks at any one time step. This makes
load balancing a little problematical. The following table indicates the
time (in seconds) that each AP spends computing at each time step.


Time Step    AP1     AP2     AP3
    1        3.52    0.00    0.00
    2        3.64    3.15    0.00
    3        3.64    3.39    2.82
    4        3.64    3.42    3.16
    5        5.85    3.39    3.16
    6        3.10    5.31    4.83
    7        2.70    2.82    4.83
    8        2.70    2.46    2.58
    9        3.88    2.46    2.24
   10        2.05    3.42    2.24
   11        1.76    1.79    3.01
   12        1.76    1.52    1.54
   13        1.99    1.52    1.30
   14         .62    1.52    1.30
   15         .43    1.61    1.30
   16         .43    0.00    0.13
   17         .43    0.00    0.00

Table 5.2


The time required for the entire computation is 51.2 seconds, the sum of
the maximum times in each row of the table. If computation was equally
shared at each time step then approximately 38.1 seconds would be
required for the complete computation.
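
Both totals can be recomputed from the per-step times. The sketch below is a check, not part of the original code: the function names are hypothetical, and the AP2 entry at time step 10, which is illegible in the extracted table, is assumed to be 3.42 because that reading reproduces both quoted totals.

```python
# Per-step compute times in seconds for (AP1, AP2, AP3), time steps 1..17.
# Assumption: the AP2 entry at time step 10 is read as 3.42.
TIMES = [
    (3.52, 0.00, 0.00), (3.64, 3.15, 0.00), (3.64, 3.39, 2.82),
    (3.64, 3.42, 3.16), (5.85, 3.39, 3.16), (3.10, 5.31, 4.83),
    (2.70, 2.82, 4.83), (2.70, 2.46, 2.58), (3.88, 2.46, 2.24),
    (2.05, 3.42, 2.24), (1.76, 1.79, 3.01), (1.76, 1.52, 1.54),
    (1.99, 1.52, 1.30), (0.62, 1.52, 1.30), (0.43, 1.61, 1.30),
    (0.43, 0.00, 0.13), (0.43, 0.00, 0.00),
]

def lockstep_time(times):
    # With a barrier after every step, each step costs as much as its slowest AP.
    return sum(max(row) for row in times)

def balanced_time(times):
    # Idealized cost if the work of every step were split evenly over the APs.
    return sum(sum(row) for row in times) / len(times[0])

if __name__ == "__main__":
    print(round(lockstep_time(TIMES), 1), round(balanced_time(TIMES), 1))
```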

The somewhat inefficient use of the APs highlighted by the second
example could be rectified in several ways:

1. Choose a smaller $m_0$. This would have the effect of increasing the
number of tasks to be shared at each time step.

2.  Vary  the  block  column  widths  so  as  to  even  out  the  update  work. 

3.  Instead  of  letting  the  AP  that  generates  a  Q  be  entirely  responsible 
for  its  application,  share  the  update. 


We have not fully explored these possibilities. Note that the first and
third suggestions imply smaller matrix multiplications and thereby
reduced 164/MAX performance.

A more promising way to address the load balancing issue would be to
incorporate a dynamic scheduling of tasks as is discussed, for example, in
George, Heath, and Liu (1985) and Dongarra, Sorenson, and Sameh (1986).
One way to do this is to order the tasks (t,j) defined in §4 as follows:

(1,1), (1,2), ..., (1,q), (2,1), (2,2), ..., (2,q), ..., (t_f,1), ..., (t_f,q)

After completing a task each AP would go to this list and "grab" the next
available task subject to rules that preserve the integrity of the overall
procedure. We will report on this elsewhere.
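
The task-list idea can be sketched with a shared queue. This is only an illustration of workers grabbing the next available task in the order given above; the function name is hypothetical, and the rules that preserve the integrity of the procedure (the dependencies between tasks) are not specified in the text and are deliberately omitted here.

```python
import queue
import threading

def run_dynamic(p, q, t_f):
    tasks = queue.Queue()
    for t in range(1, t_f + 1):          # ordering (1,1),...,(1,q),(2,1),...,(t_f,q)
        for j in range(1, q + 1):
            tasks.put((t, j))

    done = []                            # (worker, t, j) in order of completion
    lock = threading.Lock()

    def worker(k):
        while True:
            try:
                t, j = tasks.get_nowait()    # grab the next available task
            except queue.Empty:
                return
            with lock:
                done.append((k, t, j))       # stand-in for the real task body

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(1, p + 1)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return done

if __name__ == "__main__":
    print(len(run_dynamic(3, 13, 5)))
```

Every task is executed exactly once, but which worker takes which task depends on timing rather than on a fixed assignment.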
NONPARAMETRIC ESTIMATION FROM QUEUES
ARISING IN STAGGERED ENTRY CLINICAL TRIALS


Michael J. Phelan and N.U. Prabhu
Mathematical Sciences Institute, Caldwell Hall
Cornell University, Ithaca, NY 14853

Abstract: In clinical trials with staggered entries and fixed duration
of study, patients enter at random epochs and are put on test. The
objective is to study survival times from a principal cause A, but
factors such as end of study or patient withdrawal make it impossible
to observe the survival times (censoring). We consider two different
situations. (1) For some patients death may actually be from a cause
other than A, say B (competing risks). It is desired to study the
survival times associated with both causes A and B. (2) A certain
number m ($\ge 1$) of treatments are available and each entering patient is
diagnosed and assigned to one of these treatments. The objective is to
study the survival times from cause A under these treatments. These
problems are formulated in terms of queueing models, for which it is
desired to obtain nonparametric estimates of service time
distributions. We investigate an infinite server model to study case
(1) and an m-station model for case (2). The input in both models is a
point process. We observe the system over a finite time-interval
[0,t], with t fixed. The data collected consist of the arrival
epochs, service times of the customers who arrive during [0,t], with
some of the service times partially observed, and (in Model 2) delays
experienced by them before service. Our estimators are martingale
estimators, for which we establish consistency and weak convergence (as
$t \to \infty$) of the normalized difference to a Gaussian process. We present
the results for Model 1. Work on Model 2 is in progress.


Keywords: Clinical trials, censoring, survival times, queues, counting
processes, martingales, product-limit estimators.


1. Introduction


The estimation problems that we consider arise in the context of
clinical trials with staggered entries and fixed duration of study.
Patients enter the study at random epochs and are put on test (for
example, the patient is treated by a drug therapy). Typically, the
objective is to study a patient's time of death (survival time) from
cause A, such as cancer or AIDS. Factors such as end of study or
patient withdrawal make it impossible to observe patients' time of
death (censoring). We consider two different situations.

(1) Entering patients are put on test immediately. For some
patients, death may actually be from a cause other than A, such as
toxicity. Denoting this second cause as B, we say that A and B are
competing risks. In such situations it may be important to study the
hazard functions associated with both causes A and B, rather than A
alone, as is usually done. Thus the observed survival time of each
patient is the shorter of the survival times from A and B.

(2) On some occasions there are a certain number m ($\ge 1$) of
treatments available and each entering patient is diagnosed immediately
and assigned to one of these treatments depending on factors such as
the patient's background and state of health. Furthermore, limited
availability of the facilities used in the therapy may cause delay
between the patient's time of entry and the actual time he is put on
test. The objective is to study the survival times associated with the
m treatments.

Situations described above lead to the following queueing models.

Model 1.

(1) Let $\tau_0, \tau_1, \tau_2, \ldots$ denote the arrival epochs of the successive
customers. We assume that the point process $\tau = \{\tau_n, n \ge 0\}$
satisfies the following conditions:

(i) $\tau_0 = 0$, (ii) $\tau_n < \infty$ ($n \ge 1$), and (iii) $\tau_{n+1} > \tau_n$, $\tau_n \uparrow \infty$.

Let $N = \{N_t, t \ge 0\}$ denote the counting process generated by the
process $\tau$. Then $N_t$ gives the number of arrivals during a
time-interval (0,t]. Here N has right-continuous sample paths with
left limits, jumps of size 1 and $N_0 = 0$. Thus the input into the
queueing system is described equivalently by N or $\tau$.

(2) Each customer brings two demands for service. Let $(X_n, Y_n)$
denote the service times of the two demands of the nth customer
($n \ge 1$). We assume that $\{X_n, n \ge 1\}$ and $\{Y_n, n \ge 1\}$ are
independent sequences of mutually independent random variables with
common distributions $F_1$ and $F_2$, respectively, both concentrated on
$(0,\infty)$. We also assume that these service times are independent of the
input process.

(3) There are an infinite number of servers, so that there is no
waiting line. However, each server meets only the demand that needs
the shorter service time for the customer served.

Our objective is to estimate the distributions $F_1$ and $F_2$ of
the service times of the two demands. For this purpose we observe the
system over a time-interval (0,t], with t fixed. The data consist
of the arrival epochs and the service times actually received by the
first $N_t$ customers, but some of these service times may only be
partially observed; namely, those customers with $\tau_n + \min(X_n,Y_n) > t$,
for whom we only know that the service time $\min(X_n,Y_n) > t - \tau_n$.

Our estimators for $F_1$ and $F_2$ are martingale estimators. The
martingale property leads to proofs of their consistency and the weak
convergence (as $t \to \infty$) of the normalized differences to a Gaussian
process. These results are stated in section 3. Details are given
elsewhere (Phelan and Prabhu (1987)). We make only a mild assumption
concerning the rate at which $N_t$ goes to infinity. Thus we avoid
conditions such as $N_t/t \to$ constant > 0 in some sense, as is often
imposed in situations involving random sample sizes. This shows the
advantage of our approach based on martingale properties.

Model 2.

(1) Let $\tau_0, \tau_1, \tau_2, \ldots$ denote the arrival epochs of the
successive customers, where the point process $\tau = \{\tau_n, n \ge 0\}$ is as
in Model 1. We associate with $\tau_n$ a random variable $Z_n$ taking
values in $E = \{1,2,\ldots,m\}$ in such a way that the marks
$Z = \{Z_n, n \ge 0\}$ may depend in an arbitrary manner on the process $\tau$
but may also involve some other randomization. Thus the input into
this queueing system is the marked point process $\{(\tau_n, Z_n), n \ge 0\}$.

(2) There are m stations ($1 \le m < \infty$), each station being a
finite-server queueing system. The service times of customers served
at the ith station have a distribution $F_i$ concentrated on $(0,\infty)$.
Here the distributions $F_1, F_2, \ldots, F_m$ are all distinct. The customer
arriving at the epoch $\tau_n$ has a service time $X_n$ having distribution
$F_i$ whenever $Z_n = i$ ($i \in E$, $n \ge 0$). It is assumed that the $X_n$ are
mutually independent, and moreover, they are conditionally independent
of the $\tau_n$, given the $Z_n$. Thus the event $\{Z_n = i\}$ indicates that
the customer arriving at $\tau_n$ is assigned to the ith station.

(3) The queue discipline at each station is first come, first
served.

Our objective for this model is to estimate the distributions
$F_1, F_2, \ldots, F_m$ of the service times at the m stations. The
observation scheme is exactly as in Model 1, but in addition we obtain
for each arrival a record of any delay experienced before service, and
which of the stations serves this customer. Work on this model is in
progress.

In the standard analysis of staggered entry clinical trials it is
assumed that the epochs of entry $\{E_n\}$ of the patients are mutually
independent random variables and the duration of study is fixed. In
order to develop an asymptotic theory it is then assumed that the
accrual rate of patient entry increases over this interval. In the
terminology of this paper, $\{\tau_n\}$ are the order statistics generated by
$\{E_n\}$, so that the input of patients constitutes a special type of
point process. This input model is described by Jennison and Turnbull
(1985). The possibility of the patient accrual rate increasing
indefinitely over a fixed time-interval can arise in some situations
(such as in the explosive pure birth process). However, in other
situations accruals increase by virtue of the increasing length of
study, which is our approach to the asymptotic theory.


2.  The  estimators  in  Model  1 

The data described in section 1 can be conveniently summarized as
follows. For each $n \ge 1$ define

$$C_n^t = \max(0,\, t - \tau_n), \qquad W_n^t = \min(X_n, Y_n, C_n^t),$$

$$\delta_n^t = 1(C_n^t \ge \min(X_n, Y_n)), \qquad \eta_n = 1(X_n \le Y_n).$$

Here $\eta_n = 1$ iff the first demand of the nth customer is met. In the
terminology of survival analysis $\min(X_n, Y_n)$ is the survival time
induced by two competing risks, and $C_n^t$ is called a random
right-censoring time due to end of study in staggered entry clinical
trials. Thus $1 - \delta_n^t$ is the indicator of censoring and $W_n^t$ is the
observed randomly right-censored survival time. In the present context
$W_n^t$ is the observed service time, $\delta_n^t = 1$ iff the nth customer has
arrived and completed his service before time t, and $\delta_n^t \eta_n = 1$
($\delta_n^t(1 - \eta_n) = 1$) iff the server meets this customer's first (second)
demand, so that his service time is $X_n$ ($Y_n$). Our observation scheme
yields the data

(2.1)    $\{(W_n^t, \delta_n^t, \eta_n),\ n = 1, 2, \ldots, N_t\}$

from which we seek to estimate the distributions $F_1$ and $F_2$. The
estimators are actually based on the statistics

(2.2)    $N(i,t) = \{N_s(i,t),\ s \ge 0\}$ (i = 1,2);  $Y(t) = \{Y_s(t),\ s \ge 0\}$,

where

(2.2a)    $N_s(1,t) = \sum_{n=1}^{N_t} 1(W_n^t \le s,\ \delta_n^t \eta_n = 1)$

(2.2b)    $N_s(2,t) = \sum_{n=1}^{N_t} 1(W_n^t \le s,\ \delta_n^t (1 - \eta_n) = 1)$

(2.3)    $Y_s(t) = \sum_{n=1}^{N_t} 1(W_n^t \ge s)$.

Here $N_s(i,t)$ is the number of customers whose service times are less
than s among those who arrive and complete their service before time
t, and whose ith demand is met (i = 1,2). Also, $Y_s(t)$ is the number
of arrivals in (0,t] whose service times (complete or partial)
exceed s.
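
The observation scheme and the counting statistics of this section can be sketched in a few lines. The function names are hypothetical, the inequalities follow the reading $W \le s$ for the counts of completed services and $W \ge s$ for the number still in service, and the toy inputs are invented for illustration. Note that in practice $\eta_n$ is only informative when $\delta_n^t = 1$; the sketch computes it from the full pair for simplicity.

```python
def observe(arrivals, services, t):
    """arrivals: epochs tau_n; services: pairs (X_n, Y_n); t: end of study."""
    data = []
    for tau, (x, y) in zip(arrivals, services):
        if tau > t:
            continue                           # only arrivals in (0, t] are observed
        c = max(0.0, t - tau)                  # censoring time C_n^t
        w = min(x, y, c)                       # observed service time W_n^t
        delta = 1 if c >= min(x, y) else 0     # 1 iff service completed by time t
        eta = 1 if x <= y else 0               # 1 iff the first demand is the one met
        data.append((w, delta, eta))
    return data

def count_completed(data, s, demand):
    # N_s(demand, t): completed services of the given demand with W <= s.
    wanted = 1 if demand == 1 else 0
    return sum(1 for (w, d, e) in data if w <= s and d == 1 and e == wanted)

def count_at_risk(data, s):
    # Y_s(t): observed service times (complete or partial) with W >= s.
    return sum(1 for (w, _, _) in data if w >= s)

if __name__ == "__main__":
    sample = observe([1.0, 2.0, 8.0], [(3.0, 5.0), (9.0, 4.0), (1.0, 2.0)], 5.0)
    print(sample)
```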

Our estimation procedure yields estimators of $F_1$ and $F_2$, as
well as their associated cumulative conditional rate functions $b_1$ and
$b_2$ defined by

(2.4)    $b_i(t) = \int_{(0,t]} [1 - F_i(s-)]^{-1}\, dF_i(s)$    (i = 1,2, t > 0).

These estimators are provided by the processes $B^t(i) = \{B_s^t(i),\ s \ge 0\}$
and $F^t(i) = \{F_s^t(i),\ s \ge 0\}$ (i = 1,2) defined by

(2.5)    $B_s^t(i) = \sum_{u \le s} Y_u^{-1}(t)\, \Delta N_u(i,t) = \int_0^s Y_u^{-1}(t)\, dN_u(i,t)$    (i = 1,2)

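(where each term in the sum is interpreted as zero if both factors are
zero), and

(2.6)    $F_s^t(i) = 1 - \prod_{0 \le u \le s} [1 - \Delta B_u^t(i)]$,

where $\Delta B_u^t(i) = B_u^t(i) - B_{u-}^t(i)$ (i = 1,2, $0 \le u \le s$).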

For i = 1,2 in (2.5), each term in the sum is the proportion of
completed service times of ith demand equal to u (i.e. $\Delta N_u(i,t)$)
among those service times (complete or partial) which exceed u (i.e.
$Y_u(t)$) from among those customers who arrive during the time-interval
(0,t]. Thus $B_s^t(i)$ estimates the cumulative conditional service rate
for the ith demand. The expression (2.6) is an estimator of the
product integral of (2.4), which uniquely determines the distributions
$F_1$ and $F_2$ from $b_1$ and $b_2$, respectively. The estimators $F^t(i)$ are
called product-limit estimators, where by virtue of our observation
scheme they are defined from a random number of partially observed
service times (cf. Gill (1980), who studies this type of estimator from
a fixed number of censored survival times).
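
A minimal sketch of how the two estimators might be computed from the data (2.1) follows. It assumes the usual product-limit form, $F = 1 - \prod (1 - \Delta B)$, for (2.6), and the function name and toy data are hypothetical; it is not the authors' code.

```python
def martingale_estimators(data, demand):
    """data: triples (w, delta, eta); demand: 1 or 2.

    Returns (u, B_u, F_u) at each jump time u of the chosen demand.
    """
    wanted = 1 if demand == 1 else 0

    def is_event(delta, eta):
        # A completed service of the chosen demand.
        return delta == 1 and eta == wanted

    jump_times = sorted({w for (w, d, e) in data if is_event(d, e)})
    cumulative_rate, survival, path = 0.0, 1.0, []
    for u in jump_times:
        at_risk = sum(1 for (w, _, _) in data if w >= u)                    # Y_u(t)
        jumps = sum(1 for (w, d, e) in data if w == u and is_event(d, e))   # jump of N_u(i,t)
        increment = jumps / at_risk            # at_risk >= jumps >= 1 at a jump time
        cumulative_rate += increment           # running sum, as in (2.5)
        survival *= 1.0 - increment            # running product, assumed form of (2.6)
        path.append((u, cumulative_rate, 1.0 - survival))
    return path

if __name__ == "__main__":
    toy = [(1.0, 1, 1), (2.0, 1, 0), (3.0, 0, 1), (4.0, 1, 1)]
    for row in martingale_estimators(toy, demand=1):
        print(row)
```

In the toy data the second demand's completion at 2.0 and the censored observation at 3.0 only reduce the number at risk for the first demand; they do not produce jumps of its estimator.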


3. Asymptotic properties


We state the asymptotic properties of the martingale estimators
(2.5) and (2.6). We begin with the problem of consistency. For s > 0
define $y(s) = [1 - F_1(s-)][1 - F_2(s-)]$ and $\theta = \sup\{s : y(s) > 0\}$.
Suppose $y(\theta) = 0$.

Theorem 3.1 (Consistency). Let $\theta$ and the function y be defined as
above, and suppose that for $s \in [0,\theta)$, $(N_t - N_{t-s})/N_t \to_p 0$ as $t \to \infty$.
Then for $u \in [0,\theta)$ we have, as $t \to \infty$,

(3.1)    $\sup_{s \in [0,u]} |F_s^t(i) - F_i(s)| \to_p 0$    (i = 1,2),

(3.2)    $\sup_{s \in [0,u]} |B_s^t(i) - b_i(s)| \to_p 0$    (i = 1,2).
According to Theorem 3.1, for large t, $F^t(i)$ and $B^t(i)$ are
uniformly close estimates of $F_i$ and $b_i$ (i = 1,2), respectively, on
subintervals in the intersection of the support of $F_1$ and $F_2$. It
turns out that, in addition, the normalized differences in (3.1) and
(3.2) converge weakly to Gaussian processes. For this purpose let
$D(\theta)$ denote the space of right-continuous functions defined on $[0,\theta)$
and having left-limits. Also, let $Z^i = \{Z_s^i,\ s \in [0,\theta)\}$ and
$W^i = \{W_s^i,\ s \in [0,\theta)\}$ (i = 1,2) denote mean zero Gaussian processes of
independent increments and covariance functions

(3.3)    $\langle W^i, W^j \rangle(s) = \int_0^s (1 - \Delta b_i(u))\,(y(u))^{-1}\, db_i(u)$, $s \in [0,\theta)$, for i = j;    $\langle W^i, W^j \rangle(s) = 0$ for $i \ne j$,

and

(3.4)    $\langle Z^i, Z^j \rangle(s) = \int_0^s (1 - \Delta b_i(u))^{-2}\, d\langle W^i, W^i \rangle(u)$, $s \in [0,\theta)$, for i = j;    $\langle Z^i, Z^j \rangle(s) = 0$ for $i \ne j$.

Note that $Z^1$ is independent of $Z^2$ and that $W^1$ is independent of
$W^2$. We have the following theorem.

Theorem 3.2 (Asymptotic Normality). Suppose the condition of
Theorem 3.1 holds. Consider the normalized processes

(3.5)    $U_t^i = N_t^{1/2}\,(F^t(i) - F_i)/(1 - F_i)$,    $V_t^i = N_t^{1/2}\,(B^t(i) - b_i)$    (i = 1,2).

Then as $t \to \infty$

(3.6)    $(U_t^1, U_t^2) \xrightarrow{\mathcal{D}} (Z^1, Z^2)$

and

(3.7)    $(V_t^1, V_t^2) \xrightarrow{\mathcal{D}} (W^1, W^2)$

in $D(\theta) \times D(\theta)$ endowed with the Skorohod topology. □


References 


Gill, R.D. (1980): Censoring and Stochastic Integrals. Mathematical
Centre Tracts 124, Mathematisch Centrum, Amsterdam.

Jennison, C. and Turnbull, B.W. (1985): Repeated confidence intervals
for the median survival time. Biometrika 72 (3), 619-625.

Phelan, M.J. and Prabhu, N.U. (1987): Estimation from an infinite
server queueing system with two demands. Mathematical Sciences
Institute, Cornell University, Technical Report 87-40.




A  Class  of  Diffusion-Type  Probability  Distributions 
1. The Density Function

Associated with the Markov diffusion equation

$$\tfrac{1}{2}[\alpha_2(x) z]_{xx} - [\alpha_1(x) z]_x - z_t = 0, \quad z = z(x,t), \qquad (1.1a)$$

with diffusion and drift coefficients

$$\tfrac{1}{2}\alpha_2(x) = \alpha x^{2-\beta},$$

$$\alpha_1(x) = \alpha(2-\beta-p)\,x^{1-\beta} - r x, \qquad (1.1b)$$

and parameters $\alpha > 0$, $\beta > 0$, $p < 1$, $r \in \mathbb{R}$, is the class of source
density functions

f(x)  =  fSb-y(HM)/2  C(P*H)/2  lq(2(U)f/2)exp-«S*  A  0  2) 


$x > 0$, $\xi = x b^{-1}$, $\zeta = y b^{-1} \exp(-r t_0)$, $q = -1 + (1-p)\beta^{-1}$,

$I_q(\cdot)$ = modified Bessel function of the first kind of order q. From a
statistical point of view, the parameters in (1.2) are b > 0 scale,
p < 1 initial shape, $\beta$ > 0 terminal shape, and $y \ge 0$ source. The
restrictions on p and $\beta$ imply q > -1.


The designation of y as a source parameter is based on the fact
[1], [2] that the function f(x) given in (1.2) has been derived from the
delta function initial condition solution (source solution) of (1.1) with
the delta function applied at t = 0 and y > 0. As $y \downarrow 0$, f(x) reduces to
the well-known hyper-Gamma density [2], [3].


2.  The  Likelihood  Function 


Application of the source density (1.2) in statistical practice
requires a method to determine the numerical values of the
components of the parameter vector $P = (b, q, \beta, z)$, $z = y \exp(-r t_0)$,
relative to given statistical data $(x_\nu, f_\nu)$ ($\nu = 1,\ldots,n$) which
represent observations $x_\nu$ together with their relative frequencies
$f_\nu$. To this end, the likelihood function associated with (1.2) will be
established.


In general terms, let f(x;P), x > 0, be a density function depending
on a parameter vector P. Let $X_\nu$ ($\nu = 1,\ldots,N$) be random sample
values of a random variable X which is assumed to be distributed
according to f(x;P). The associated likelihood function is

$$L(P) = \prod_{\nu=1}^{N} f(X_\nu; P).$$

If the set $\{X_\nu\}$ contains the distinct elements $x_\nu$ ($\nu = 1,\ldots,n \le N$)
with relative frequencies $f_\nu$, the log-likelihood function is

$$\phi(P) = N^{-1} \log L(P) = \sum_{\nu=1}^{n} f_\nu \log f(x_\nu; P).$$
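
The identity between $N^{-1}\log L(P)$ and the frequency-weighted sum can be sketched for any density. The example below uses an exponential density purely as a stand-in, since the source density (1.2) is not reproduced here; the function names and the sample are hypothetical.

```python
import math
from collections import Counter

def log_likelihood(sample, density):
    # phi(P) = sum over distinct values x of f_x * log f(x; P),
    # where f_x is the relative frequency of x in the sample.
    n_total = len(sample)
    counts = Counter(sample)
    return sum((c / n_total) * math.log(density(x)) for x, c in counts.items())

def exponential_density(rate):
    # Stand-in density for illustration only; not the source density (1.2).
    return lambda x: rate * math.exp(-rate * x)

if __name__ == "__main__":
    sample = [1.0, 1.0, 2.0, 3.0]
    print(log_likelihood(sample, exponential_density(0.5)))
```

Grouping repeated observations this way gives the same value as averaging $\log f$ over the raw sample, which is what the relative-frequency form expresses.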

For the density (1.2) the function $\phi(P)$ takes the form


Numerical solution attempts on the equations $\partial\phi^*/\partial q = 0$, $\partial\phi^*/\partial\beta = 0$,
$\partial\phi^*/\partial z = 0$ by means of derivative-based methods have not been
satisfactory. Direct optimization techniques are under investigation.
Results will be reported upon when they become available.
COLUMN  MOVEMENT  MODEL  USED  TO  SUPPORT  AMM


George  B.  McKinley 

U.S.  Army  Engineer  Waterways  Experiment  Station 
Geotechnical  Laboratory 
Vicksburg,  MS  39180-0631 


ABSTRACT.  For  many  years  mobility  maps  have  been  created  utilizing  the 
Army  Mobility  Model  (AMM).  These  maps  show  the  maximum  speed  which  a  vehicle 
can  attain  in  off-road  terrain  and  on-road  networks.  These  maps  are  useful  in 
comparing  the  performance  of  vehicles  or  as  an  aid  in  route  selection.  Three 
additional  computer  models  have  been  developed  to  support  the  AMM  by 
predicting  the  performance  of  military  vehicles  over  digital  terrain  units 
along  a  specified  route.  These  three  models  are  an  Acceleration  Model,  a 
Traverse  Model,  and  a  Column  Movement  Model.  The  Acceleration  Model  produces 
a  time  versus  speed  curve  for  a  vehicle  along  a  specified  route  across  a 
terrain  unit.  The  Traverse  Model  uses  the  Acceleration  Model  as  a  building 
block  and  predicts  a  vehicle's  performance  along  a  specified  route  over  a 
series  of  terrain  units.  The  Column  Movement  Model  uses  the  Traverse  Model  as 
a  building  block  for  predicting  performance  of  a  column  of  vehicles  along  a 
specified  route.  The  Column  Movement  Model  maintains  vehicle  spacing  within 
the  column  in  accordance  with  military  doctrine. 

I.  TERRAIN  DATA  SELECTION.  The  terrain  data  required  by  the  models  may 
be  acquired  using  one  of  three  basic  methods.  The  first  method  consists  of 
surveying  a  traverse  to  determine  slopes  and  curvatures.  The  courses  are 
concurrently  subdivided  into  a  number  of  segments  (terrain  units),  each  of 
which  should  be  nominally  uniform  with  respect  to  values  pertinent  to  mobility 
including  surface  roughness  (rms  elevation),  slope,  driver  recognition 
distance,  radius  of  curvature,  soil  type,  and  soil  strength.  From  these 
measurements  a  digital  terrain  data  base  is  developed  for  use  with  AMM 
(Nuttall,  Green,  Dean,  and  Gray  1985). 

A  second  method  consists  of  using  the  Waterways  Experiment  Station's 
(WES)  Digital  Road  Net  Data  bases  which  exist  for  a  selected  few  1:50,000 
scale  map  sheets  in  the  Federal  Republic  of  Germany.  Software  has  been 
developed  at  WES  to  select  "best  paths"  on  this  network  based  on  either  time 
or  distance.  This  path  selection  is  accomplished  by  use  of  a  blind 
bidirectional  search. 

The  third  method  involves  the  manual  selection  of  a  path  through  an  areal 
map.  This  selection  may  consist  of  visually  analyzing  a  speed  prediction  map 
and  choosing  sufficient  Universal  Transverse  Mercator  (UTM)  coordinates  to 
define  the  desired  path.  Software  developed  at  WES  will  then  create  the 
proper  terrain  file  by  either  assuming  linear  movement  between  the  specified 
UTMs  or  by  selecting  the  "best  path"  by  use  of  the  blind  bidirectional  search. 

II.  AMM.  The  AMM  is  a  comprehensive  analytical  model  designed  to
evaluate  objectively  the  on-  and  off-road  mobility  of  vehicles  by  means  of 
digital  computer  simulation  (Nuttall,  Dugoff,  and  Rula  197^).  First  developed 
in  1971,  the  AMM  is  the  Waterways  Experiment  Station's  living  mobility  model 
and  is  modified  as  required,  based  on  improved  mobility  algorithms  and 
customer  needs.  The  AMM  is  organized  as  illustrated  by  the  general  flow 
diagram  in  Figure  1. 
For  its  data  base,  the  AMM  requires  quantitative  input  descriptions  of
terrain,  vehicle,  and  driver  attributes  as  shown  in  Table  1.  In  somewhat  more 
detail,  Table  2  describes  how  terrain  data  are  portrayed  in  the  AMM.  Driver 
attributes  in  AMM  characterize  the  driver  according  to  his  ability  to  perceive 
and  react  to  visual  stimuli  affecting  his  behavior  as  a  vehicle  controller  and 
his  limiting  tolerances  to  shock  and  vibration.  The  influence  on  vehicle 
speed  of  these  latter  driver  attributes  is  taken  into  account  by  the  vehicle 
ride  dynamics  module  of  the  AMM  (Figure  1). 

In  following  the  general  flow  diagram  in  Figure  1,  raw  input  driver  and
terrain  data  first  are  adjusted  to  account  for  the  influence  of  appropriate 
"scenario"  factors,  such  as  season  and  weather.  Terrain  data  in  the  AMM  are 
used  to  describe  small  patches  or  segments,  each  one  of  which  is  defined  by  a 
set  of  values  of  terrain  factor  classes  (Table  2)  that  is  different  in  at 
least  one  terrain  factor  class  value  from  the  sets  of  class  values  of  all 
contiguous  patches. 


As  shown  in  Figure  1,  input  vehicle,  driver,  and  terrain  data  are 
modified  by  the  vehicle  data  preprocessor,  the  vehicle  ride  dynamics  module, 
and  the  terrain  data  preprocessor.  The  vehicle  data  preprocessor  is  a  part  of 
the  main  program  of  AMM,  and  is  used  once  at  the  beginning  of  an  AMM  run  to 
compute  vehicle  power  train  and  soil-running  gear  characteristics  that  are 
repeatedly  used  in  making  subsequent  vehicle  mobility  predictions  for 
individual  areal  patches  or  road  segments. 

The  ride  dynamics  module  operates  on  a  stand-alone  basis,  and  in  effect 
serves  as  a  major  preprocessor  of  input  vehicle,  driver,  and  terrain  data 
(Murphy  and  Ahlvin  1976).  Data  similar  to  the  output  of  the  ride  dynamics 
module  may  also  be  obtained  by  field  testing  of  vehicles  on  ride  dynamics  and 
obstacle  test  courses.  From  input  vehicle,  driver,  and  obstacle  height  data, 
the  ride  dynamics  module  computes  vehicle  speed  values  at  which  a  vertical 
acceleration  of  2.5-g's  is  experienced  at  the  driver's  station.  The  ride 
dynamics  module  also  computes,  as  a  function  of  surface  microroughness 
(expressed  as  the  root-mean  square  elevation  (rms)  of  the  effective  profile),
speed  values  corresponding  to  limits  of  driver  tolerance  to  random 
vibrations.  This  tolerance  is  defined  in  terms  of  the  vibrational  power 
absorbed  by  a  person  at  a  specific  location  in  the  vehicle,  often  taken  as  a 
constant  tolerance  limit  of  6-watts  (Lins  1972).  Currently,  data 
preprocessing  in  the  ride  dynamics  module  reduces  dynamics-based  predictions 
in  areal  patches  and  road  segments  to  a  rapid  table  lookup  process. 

The  terrain  data  preprocessor  converts  the  ranges  of  values  of  terrain 
factor  classes  stored  in  the  terrain  data  base  to  the  engineering  values  used 
by  subsequent  AMM  modules.  Ordinarily,  the  value  assigned  for  a  given  terrain 
or  road  factor  is  the  best-estimate  value  of  that  factor's  class  range.  This 
preprocessor  also  accounts  for  "scenario  factors"  by  adjusting  or  selecting 
among  stored  terrain  or  road  factor  values  to  reflect  the  influence  of 
variations  in  season,  weather,  day  or  night  operation,  and  other  factors. 

Continuing  in  Figure  1,  data  acted  upon  by  the  three  preprocessors  next 
are  used  in  the  vehicle  performance  prediction  modules  that  are  the  heart  of 
the  AMM — the  areal  patch  module  and  the  on-road  segment  module.  The  general 
flow  of  the  areal  patch  module  is  shown  in  Figure  2.  Input  to  this  module 
from  the  vehicle  data  preprocessor  includes  the  relation  between  vehicle  speed 
and  tractive  force  for  the  vehicle  on  a  smooth,  level,  firm  surface,  and  the
minimum  soil  strength  that  the  vehicle  requires  to  maintain  headway  on  level, 
weak  soils.  Using  these  data  and  appropriate  data  from  the  terrain  data 
preprocessor,  the  areal  patch  module  checks  for  vehicle-obstacle  interferences 
(hangups),  determines  the  total  tractive  force  required  to  overcome  terrain 
impediments,  and  computes  vehicle  speed  limited  by  the  total  motion-resisting 
forces. This calculation involves interaction of the soils, slope, obstacle
traction,  obstacle  override,  and  vegetation  impact  and  override  submodels. 

NOGO  is  called  when  vehicle  hangup  is  predicted  and  when  vehicle  traction  and 
override  force  are  computed  to  be  insufficient  to  overcome  resistances  to 
motion. 

Next,  the  areal  patch  module  selects  the  minimum  speed  among  speed 
limited  by  the  resisting  forces;  ride-limited  speed  (obtained  from  the  vehicle 
ride  dynamics  module  output  data  array);  and  visibility-limited  speed  (from 
the  visibility  submodel).  This  speed  is  then  modified  to  account  for 
acceleration  and  deceleration  between  discrete  obstacles  and  maneuvering  to 
avoid  vegetation  and  other  obstacles.  This  procedure  produces  a  maximum 
vehicle  speed  predicted  for  a  particular  terrain  unit  and  a  particular 
vegetation  override/avoidance  option.  The  procedure  is  repeated  for  the 
vehicle operating up slope, down slope, and across slope, producing three
speed  predictions. 
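
The selection step described above can be sketched in a few lines of Python. This is only an illustration: the function names and the dictionary layout are assumptions, and the later adjustment for acceleration, deceleration, and maneuvering between obstacles is not modeled.

```python
def patch_speed(force_limited, ride_limited, visibility_limited):
    """Governing speed for one direction of travel: the least of the limits."""
    return min(force_limited, ride_limited, visibility_limited)


def patch_speeds(limits_by_direction):
    """Apply the selection to each direction (up, down, across slope).

    limits_by_direction maps a direction name to a tuple of
    (force-limited, ride-limited, visibility-limited) speeds.
    """
    return {direction: patch_speed(*limits)
            for direction, limits in limits_by_direction.items()}
```

Calling `patch_speeds` with one tuple per direction returns three speeds, mirroring the three predictions the module produces for each terrain unit.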

Compared  to  areal  (off-road)  terrain,  on-road  terrain  includes 
considerably fewer factors that affect vehicle performance. Still, the on-road
module (the third AMM performance prediction module) has a computational
structure  similar  to  that  of  the  areal  module.  For  the  particular  road 
surface  material  of  interest,  values  of  tractive  and  rolling  resistance 
coefficients  are  obtained  for  the  given  wheeled  or  tracked  vehicle  operating 
straightline  and  level  at  maximum  speed.  Separate  speeds  are  then  computed  as 
limited  by  available  traction  and  countervailing  resistances  (rolling,  grade, 
and  curvature);  ride  dynamics  (absorbed  power);  visibility  and  braking;  tire 
load,  inflation,  and  construction;  and  road  curvature.  The  least  of  these 
five  speeds  is  assigned  as  the  maximum  for  the  on-road  segment  scrutinized. 
Scenario  options  and  combinatorial  procedures  to  predict  vehicle  speed  are 
exercised  in  the  on-road  module  similarly  to  the  method  previously  described 
for  the  areal  patch  module. 

III. ACCELERATION MODEL. The Acceleration Model predicts speed versus
time relationships for a vehicle accelerating on a defined surface (road or
areal). The vehicle's acceleration is modeled using tractive force versus
speed data obtained from the AMM which has been modified to account for
slippage  of  the  vehicle's  running  gear  in  the  soil.  The  vehicle  accelerates 
using  the  amount  of  tractive  force  available  beyond  that  which  is  required  to 
overcome  the  sum  of  the  resisting  forces  (usually  only  motion  resistance, 
since  acceleration  tests  are  normally  run  on  terrain  with  no  slope  or 
vegetation).  The  time  and  distance  for  acceleration  are  calculated  for  each 
segment  of  the  curve.  Two  different  methods  are  used  to  calculate  these 
values.  If  acceleration  is  known  to  not  be  constant  (i.e.,  the  forces  that 
serve  as  endpoints  of  the  current  line  segment  are  unequal)  then  acceleration 
is  modeled  as  if  it  were  linear  between  the  two  speeds.  This  is  accomplished 
in  two  steps.  First  the  time  to  accelerate  between  the  two  speeds  is 
calculated  as  follows: 


DELTAT = VMR/A * LOG( (A*VX+B)/(A*V1+B) )

where

DELTAT = time necessary to complete acceleration step (seconds)
VMR    = vehicle's mass modified for inertia (slugs)
A      = slope of current line segment of curve
V1     = velocity at start of acceleration step (ft/s)
VX     = velocity at end of acceleration step (ft/s)
B      = y intercept of current line segment of curve


Once  the  time  has  been  calculated  the  distance  which  will  be  covered  during 
the  acceleration  step  may  be  calculated  as  follows: 

DELTAX = VMR*(A*V1+B)/(A**2) * (EXP(A/VMR*DELTAT) - 1.) - B*DELTAT/A

where 

DELTAX  =  distance  covered  during  acceleration  step  (ft) 

VMR  =  vehicle's  mass  modified  for  inertia  (slugs) 

A  =  slope  of  current  line  segment  of  curve 

V1 = velocity at start of acceleration step (ft/s)

B  =  y  intercept  of  current  line  segment  of  curve 

DELTAT  =  time  necessary  to  complete  acceleration  step  (seconds) 
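
As a check on the two expressions, the following Python sketch evaluates one acceleration step. It assumes that A*V + B is the net accelerating force on the current line segment (tractive force less resistance); the function name and the zero-slope fallback to constant acceleration are illustrative choices, not taken from the WES code.

```python
import math


def accel_step(vmr, a, b, v1, vx):
    """Return (DELTAT, DELTAX) for accelerating from v1 to vx (ft/s).

    vmr  : inertia-modified mass (slugs)
    a, b : slope and y intercept of the force-versus-speed line segment
    """
    if a == 0.0:
        # Equal forces at both endpoints: constant acceleration, F = MA.
        acc = b / vmr
        dt = (vx - v1) / acc
        return dt, v1 * dt + 0.5 * acc * dt ** 2
    ratio = (a * vx + b) / (a * v1 + b)
    if ratio <= 0.0:
        raise ValueError("vx cannot be reached on this line segment")
    dt = vmr / a * math.log(ratio)
    dx = (vmr * (a * v1 + b) / a ** 2 * (math.exp(a / vmr * dt) - 1.0)
          - b * dt / a)
    return dt, dx
```

Both expressions follow from integrating VMR * dV/dt = A*V + B, first for time and then for distance, which is why the distance formula needs DELTAT as an input.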

In  the  case  of  constant  acceleration,  the  acceleration  is  calculated  using 
F=MA  and  time  and  distance  are  calculated  using  the  equations  of  motion  with 
constant  acceleration.  In  both  cases  the  vehicle's  mass  is  modified  by  a 
factor  which  simulates  the  inertial  mass  of  the  rotating  parts  which  must  be 
accelerated when the entire vehicle is accelerated. The vehicle's mass is
modified  as  follows: 


VMR = VM * ( RMF1+RMF2*(FAVG**2) )


where 

VMR  =  vehicle's  mass  modified  for  inertia  (slugs) 

VM  =  vehicle's  unmodified  mass  (slugs) 

RMF1  =  1.14  if  there  is  a  tracked  assembly  on  the  vehicle 
=  1.06  otherwise 

RMF2 = 0.002*(IDIESL*CID)**1.68/(NCYL*GCW)*TCOR * (RR/ETA/QMAX)**2


where 

IDIESL  =  3  if  engine  is  turbine 

=  2  if  engine  is  a  two  cycle  diesel 
=  1  otherwise 

CID  =  engine  displacement  in  cubic  feet  (rated  horsepower  for 
turbine) 

NCYL  =  number  of  cylinders  (1  is  used  for  turbine) 

GCW  =  gross  combined  weight  of  vehicle 

RR  =  rolling  radius  (ft)  if  wheeled  assembly  or 

sprocket  pitch  radius  (ft)  if  it  includes  a  tracked  assembly 
ETA  =  0.7  if  there  is  a  tracked  assembly  present 
=  0.9  otherwise 

QMAX  =  maximum  engine  torque  (ft-lbs) 

FAVG = tractive effort at center of acceleration step (ft.-lbs.)

TCOR  =  0.125  if  engine  is  a  turbine 

=  1  otherwise 
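
The mass modification can be written as a short Python function. This is a sketch under assumptions: the RMF2 expression is read as 0.002*(IDIESL*CID)**1.68/(NCYL*GCW)*TCOR*(RR/ETA/QMAX)**2, the function and argument names are illustrative, and units are taken as given in the definitions above.

```python
def modified_mass(vm, favg, tracked, idiesl, cid, ncyl, gcw, rr, qmax, tcor):
    """Return VMR, the vehicle mass adjusted for rotating parts (slugs)."""
    rmf1 = 1.14 if tracked else 1.06   # base rotating-mass factor
    eta = 0.7 if tracked else 0.9      # ETA from the definitions above
    rmf2 = (0.002 * (idiesl * cid) ** 1.68 / (ncyl * gcw) * tcor
            * (rr / eta / qmax) ** 2)
    return vm * (rmf1 + rmf2 * favg ** 2)
```

With FAVG equal to zero the factor reduces to RMF1 alone, so the correction grows with the square of the tractive effort at the center of the step.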

These calculations are performed for each segment of the tractive force versus
speed  curve  until  the  vehicle's  maximum  predicted  speed  is  attained.  Example 
outputs  from  the  Acceleration  Model  are  shown  in  Figures  3  through  5. 

IV. TRAVERSE MODEL. The WES Traverse Model predicts the time required
by  a  defined  vehicle  to  cross  a  series  of  terrain  units  (AMM  road  or  areal 
format).  The  vehicle  is  first  run  over  the  digital  terrain  using  the  AMM, 
thus  computing  all  the  values  necessary  for  predicting  the  vehicle's 
performance  over  each  terrain  unit. 

The  traverse  begins  with  the  vehicle  at  the  start  of  the  first  terrain 
unit  at  zero  velocity.  When  the  vehicle  first  accelerates  and  upon  entering 
any  other  terrain  unit,  the  model  finds  the  corresponding  tractive  force  for 
the  vehicle's  current  velocity.  If  this  tractive  force  is  found  equal  to  the 
total of the resisting forces in the current terrain unit then the vehicle
will  not  accelerate.  If  the  vehicle  is  found  to  accelerate,  then  the  time  and 
distance  for  acceleration  are  calculated  using  the  same  algorithms  utilized  by 
the  Acceleration  Model. 

Each  terrain  unit  has  two  speeds  associated  with  it.  One  speed  is  the 
predicted  speed,  which  is  the  maximum  speed  which  may  be  reached  by 
acceleration  from  a  lower  speed  in  that  terrain  unit.  The  other  speed  is  the 
maximum  speed  at  which  a  vehicle  may  enter  the  terrain  unit.  The  limit  is  the 
lowest  speed  chosen  by  the  AMM  from  among  the  ride,  visibility,  and  curvature 
(when  on-road)  limited  speeds.  The  only  stipulation  for  a  vehicle's  entering 
speed is that it be less than or equal to the limiting speed. In the case of
a soil-strength limited terrain unit a vehicle is allowed to enter at a higher
speed than that predicted maximum, but the speed must still be less than or
equal  to  the  limited  speed.  In  this  situation  the  vehicle's  deceleration  will 
be  modeled  by  moving  backwards  along  the  tractive  force  versus  speed  curve. 

The vehicle's speed at the end of each acceleration step is compared to
the  limited  speed  of  the  next  terrain  unit.  When  the  vehicle's  speed  becomes 
greater  than  that  limit,  the  distance  required  to  brake  from  the  current  speed 
to  that  limit  is  computed.  This  braking  is  modeled  by  allowing  the 
application  of  the  maximum  braking  force  available  for  that  vehicle  on  the 
current  terrain.  The  equation  F  =  MA  is  used  to  compute  this  constant 
deceleration.  If  the  sum  of  the  distance  used  for  acceleration  and  that 
required  for  braking  becomes  greater  than  the  length  of  the  current  terrain 
unit,  then  the  intersection  of  the  current  acceleration  step  and  the  braking 
line  is  computed.  From  the  time  and  distance  used  for  both  acceleration  and 
braking,  an  average  velocity  for  the  terrain  unit  can  be  calculated.  If  the 
vehicle  reaches  the  predicted  speed  for  the  terrain  unit  then  the  time  and 
distance  at  that  speed  will  also  be  used  to  calculate  the  average  speed.  If 
the application of brakes were ever necessary over an entire terrain unit plus
portions of a previous unit, the model would revert to that previous
terrain unit and take proper action to correct the exiting speed of that unit
to allow for proper braking in the current unit.
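
The braking check can be illustrated with the constant-deceleration relations. The sketch below is a simplification: the function names are hypothetical, and `peak_speed` treats the acceleration step as constant acceleration, whereas the model uses the linear force segment described in Section III.

```python
import math


def braking(v, v_limit, brake_force, mass):
    """Distance (ft) and time (s) to brake from v to v_limit (ft/s).

    Deceleration is constant and follows from F = MA.
    """
    decel = brake_force / mass
    distance = (v ** 2 - v_limit ** 2) / (2.0 * decel)
    return distance, (v - v_limit) / decel


def peak_speed(v0, v_exit, accel, decel, length):
    """Speed at which the acceleration and braking lines intersect.

    Solves d_accel + d_brake = length for constant accel and decel, so the
    vehicle enters at v0, peaks, and leaves the unit at v_exit.
    """
    vp_sq = ((2.0 * accel * decel * length
              + decel * v0 ** 2 + accel * v_exit ** 2) / (accel + decel))
    return math.sqrt(vp_sq)
```

When the acceleration distance plus the braking distance exceeds the unit length, the peak speed from the second function replaces the speed the vehicle would otherwise have reached.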

The  exiting  speed  of  a  terrain  unit  is  used  as  the  entering  speed  for  the 
following terrain unit. The vehicle's time in each terrain unit and the
length of the unit are used to compute an average speed for that unit along
with an average speed for the distance up to and including that unit.

V. COLUMN MOVEMENT MODEL. The WES Column Movement Model computes the
total time required for a selected group of vehicles to traverse a series of
terrain units. The vehicles that constitute the column must follow one of
three  sets  of  basic  march  orders.  The  first  column  type  is  an  infiltration  in 
which  each  vehicle  moves  at  its  best  speed  over  the  entire  route.  Vehicles 
travel  together  in  formation,  but  vehicles  are  allowed  to  pass  each  other  when 
possible.  The  vehicles  leave  the  staging  area  at  random  intervals  of  1  to 
10  minutes  in  duration. 

The  second  column  to  be  modeled  is  the  open  column.  In  the  open  column 
vehicles  travel  in  single  file  over  the  entire  route.  The  vehicles  must 
maintain  a  spacing  of  between  50  and  100  meters.  Each  vehicle  will  start 
20  seconds  after  the  previous  vehicle  if  this  will  allow  for  proper  vehicle 
spacing  to  be  maintained.  If  the  previous  vehicle  has  reached  the  maximum 
spacing  limit  in  less  than  20  seconds  then  the  next  vehicle  is  allowed  to 
start. 


The  third  type  of  column  to  be  modeled  is  the  closed  column.  The  closed 
column  is  identical  to  the  open  column,  except  that  the  vehicle  spacing  must 
remain  between  10  and  50  meters.  The  optimum  start  interval  is  changed  to 
9  seconds  for  the  closed  column  and  is  used  as  in  the  open  column. 

Each  vehicle's  acceleration  and  braking  are  modeled  in  a  manner  similar 
to  the  acceleration  and  braking  modeled  in  the  traverse  model.  The  major 
difference  is  that  each  vehicle's  progress  is  monitored  at  a  user  specified 
time interval. A small interval (5 seconds or less) is preferred, since it
should  yield  more  accurate  modeling  of  vehicle  interaction. 

Each time interval may be evaluated twice. First, each vehicle will
traverse  the  terrain,  obeying  terrain  speed  limits,  until  the  time  interval  is 
over.  Each  vehicle's  entering  speed  for  each  terrain  unit  and  time  spent  in 
each  terrain  unit  are  saved  for  possible  later  modification. 

Next, the position of each vehicle is checked to ensure that the column's
unity is maintained. If distances between vehicles are too large or too
small, certain vehicles are required to proceed at a slower pace over the time
frame.
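
The spacing check at the end of each time interval might look like the following Python sketch. The limits are the open and closed column values quoted above; the function name, the flag strings, and the choice to report every gap rather than decide which vehicle slows down are assumptions, since the text does not specify that rule.

```python
# Allowed gap between successive vehicles, in meters, by column type.
SPACING_LIMITS_M = {"open": (50.0, 100.0), "closed": (10.0, 50.0)}


def check_gaps(positions_m, column_type):
    """Classify the gap behind each vehicle except the last.

    positions_m lists distance travelled along the route, lead vehicle first.
    """
    low, high = SPACING_LIMITS_M[column_type]
    flags = []
    for lead, follower in zip(positions_m, positions_m[1:]):
        gap = lead - follower
        if gap < low:
            flags.append("too_small")
        elif gap > high:
            flags.append("too_large")
        else:
            flags.append("ok")
    return flags
```

A gap flagged here would trigger the second evaluation of the interval, in which the offending vehicles are rerun at a reduced pace.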


VI.  CONCLUSIONS.  The  above  describes  three  programs  serving  as 
extensions  to  the  AMM,  which  predict  vehicle  mobility  when  a  path  consisting 
of  known  terrain  units  is  specified  as  input.  The  Acceleration  Model  predicts 
time,  speed,  and  distance  relationships  for  a  vehicle  over  a  single  terrain 
unit.  The  Traverse  Model  accurately  predicts  vehicle  performance  over  a 
series  of  terrain  units.  The  Column  Movement  Model  adequately  represents  the 
movement  of  groups  of  vehicles  over  a  sequence  of  road  and  areal  terrain  units 
in  which  their  interaction  with  both  the  terrain  and  each  other's  position  are 
modeled.  Future  plans  include  applying  the  methodology  utilized  by  the  Column 
Movement  Model  to  additional  unit  formations  of  varying  size  and  composition. 

Figure 3. Sample speed versus time plot for M2

Figure 4. Sample speed versus distance plot for M2


     TIME          M113     M113     M60A3    M113     M60A3    M60A3
  (h  min  s)      MILES    MILES    MILES    MILES    MILES    MILES

   0    0   45     0.220    0.192    0.164    0.157    0.136    0.129
   0    1   30     0.447    0.437    0.409    0.402    0.378    0.349
   0    2   15     0.707    0.681    0.674    0.655    0.627    0.621
   0    3    0     0.920    0.906    0.897    0.890    0.872    0.865
   0    3   45     1.167    1.158    1.130    1.123    1.106    1.098
   0    4   30     1.382    1.354    1.326    1.313    1.285    1.278
   0    5   15     1.503    1.493    1.466    1.454    1.427    1.414
   0    6    0     1.646    1.618    1.590    1.575    1.547    1.527
   0    6   45     1.846    1.821    1.799    1.776    1.764    1.746
   0    7   30     2.057    2.028    2.000    1.979    1.957    1.941
   0    8   15     2.234    2.216    2.189    2.182    2.163    2.147
   0    9    0     2.350    2.333    2.320    2.302    2.293    2.269
   0    9   45     2.534    2.506    2.489    2.466    2.438    2.415
   0   10   30     2.646    2.618    2.590    2.562    2.534    2.506
   0   11   15     2.860    2.832    2.805    2.783    2.776    2.750
   0   12    0     3.067    3.039    3.011    2.983    2.977    2.964
   0   12   45     3.307    3.286    3.260    3.232    3.205    3.199
   0   13   30     3.579    3.550    3.525    3.497    3.469    3.458
   0   14   15     3.818    3.790    3.767    3.756    3.728    3.717
   0   15    0     4.028    4.004    3.980    3.960    3.932    3.904
   0   15   45     4.241    4.228    4.205    4.182    4.154    4.126
   0   16   30     4.443    4.415    4.388    4.369    4.341    4.318
   0   17   15     4.659    4.637    4.610    4.598    4.575    4.554
   0   18    0     4.835    4.807    4.779    4.765    4.741    4.723
   0   18   45     5.036    5.008    4.986    4.964    4.936    4.914
   0   19   30     5.233    5.215    5.202    5.179    5.151    5.122
   0   20   15     5.473    5.445    5.439    5.429    5.401    5.373
   0   21    0     5.666    5.638    5.610    5.597    5.569    5.548
   0   21   45     5.867    5.839    5.811    5.784    5.756    5.734
   0   22   30     6.069    6.041    6.013    6.006    5.985    5.965
   0   23   10     6.094    6.094    6.094    6.094    6.094    6.094

  VEHICLE          TIME
                (h  min  s)

  M113           0   22   36
  M113           0   22   42
  M60A3          0   22   51
  M113           0   22   52
  M60A3          0   23    1
  M60A3          0   23    6

Figure  6.  Sample  output  from  the  WES  Column  Movement  Model 

Table 1

Terrain, Vehicle, and Driver Attributes
Characterized in the AMM Data Base

Terrain

  Surface composition
    Type
    Strength
  Surface geometry
    Slope
    Discrete obstacles
    Roughness
  Vegetation
    Stem size and spacing
    Visibility
  Linear geometry
    Stream cross section
    Water velocity and depth

Vehicle

  Geometric characteristics
  Inertial characteristics
  Mechanical characteristics

Driver

  Reaction time
  Recognition distance
  Vertical acceleration limit
  Horizontal acceleration limit

Table 2

Terrain Data Required for AMM

                                                                      No. of
                                                                      Factor
Terrain or Road Factor                 Description*   Range           Classes

Off-Road

 1. Surface material
    a. Type                            USCS/other     NA                 4
    b. Mass strength                   CI or RCI      0 to >280         11
    c. Wetness                         NA             NA                 4
 2. Slope                              Percent        0 to >70           8
 3. Obstacle
    a. Approach angle                  Degrees        90 to 270         14
    b. Vertical magnitude              cm             0 to >85           7
    c. Length                          m              0 to >150          7
    d. Width                           cm             0 to >120          5
    e. Spacing                         m              0 to >60           8
    f. Spacing type                    NA             NA                 2
 4. Surface roughness (x 10)           rms, cm        0 to >7.5          9
 5. Stem diameter                      cm             0 to >25           8
 6. Stem spacing                       m              0 to >20           8
 7. Visibility                         m              0 to >50           9
 8. Left approach angle (LA)           Degrees        90 to 270         20
 9. Right approach angle (RA)          Degrees        90 to 270         20
10. Differential bank height or
    differential vertical
    magnitude (A)                      m              0 to >4            9
11. Base width or top width            m              0 to >70          21
12. Low bank height or least
    vertical magnitude (LBH)           m              0 to >6            8
13. Water depth (D)                    m              0 to >5
14. Water velocity                     mps            0 to >3.5

On-Road

15. Surface material
    a. Type                            NA             NA                 4
    b. Surface strength                CI or RCI      0 to >280         11
16. Slope                              Percent        0 to 50            8
17. Surface roughness (x 10)           rms, cm        0 to 4             9
18. Curvature                          Degrees        <140 to 180        9
19. Visibility                         m              0 to 91.4          9

* AMM can accept terrain data in either inch-pound or metric units of
  measurement. Data preprocessors in AMM convert values of all input data to
  the inch-pound system before calculations involving these data occur.


INFLUENCE  OF  REFLECTED  SHOCK  WAVES  ON  A  HYPERSONIC  SHAPED  CHARGE  JET 


H.W.  Meyer 
J.E.  Danberg 

Ballistic  Research  Laboratory 
Aberdeen  Proving  Ground,  Md.  21005 


Abstract 

Considerable  research  has  been  devoted  to  shaped  charge  jet  formation  and 
penetration  but  little  work  has  been  reported  on  the  aerodynamic  forces  on  the 
jet particles, particularly the interference caused by nearby surfaces. The
object here was to develop numerical methods and apply them to this problem. A
Godunov inviscid technique has been used along with high temperature
thermodynamic properties to obtain the flow field in front of a hemisphere. This was
used  as  the  initial  condition  for  the  computation  of  the  flow  field  in  the 
annular  region  between  the  jet  and  a  surrounding  cylindrical  tube.  Computations 
were  done  at  Mach  number  4  and  compared  to  experimental  data.  The  jet  problem 
(Mach  number  20.45)  was  then  solved. 


I. INTRODUCTION

The  objective  of  this  effort  is  to  study  the  hypersonic  flow  field 
associated  with  a  shaped  charge  jet.  The  ultimate  concern  is  the  evaluation  of 
how  and  to  what  extent  aerodynamic  effects  cause  perturbations  to  the  jet.  The 
work  to  be  described  here  has  concentrated  on  development  of  numerical 
techniques  applicable  to  this  problem. 

A  shaped  charge  warhead  consists  of  a  cylindrical  explosive  charge  with  a 
conical cavity in one end. The cavity is typically lined with a hollow
conical copper liner of about 2 mm thickness. The shaped charge jet is formed when
a  detonation  wave,  traveling  through  the  surrounding  explosive,  implodes  the 
liner  upon  itself  with  such  force  that  a  stream  of  copper  is  ejected  along  the 
axis  of  the  cone.  A  precision  warhead  as  shown  in  Figure  1  produces  a  thin  jet 
traveling  at  a  speed  above  Mach  20  at  standard  sea  level  conditions. 

A  flash  radiograph  of  a  jet  from  the  BRL  3.2  inch  precision  shaped  charge 
is shown in Figure 2. The breakup into many small particles is characteristic
of all jets. As long as the particles remain aligned, the jet is highly lethal.
If the jet particles are perturbed because of aerodynamic interference between
particles or because of wave reflections from nearby surfaces, its lethality
can be seriously degraded.

While  considerable  research  has  been  conducted  in  the  fields  of  jet 
formation  and  jet  penetration,  little  effort  has  been  devoted  to  studying  the 
aerodynamic  forces  that  influence  the  jet.  Many  examples  have  been  obtained 
which show jets disturbed in passing through but not touching various
geometries, and it was concluded that aerodynamic forces were most probably
responsible. Experiments were initiated to eliminate the aerodynamic factors by
producing  a  jet  in  a  vacuum.  However,  the  experiments  were  inconclusive  because 
of  the  difficulty  of  maintaining  the  vacuum  during  the  penetration  process. 


Yen [1], under contract to the Ballistic Research Laboratory, attempted to
develop a better understanding of the interaction of the flow field with the
jet.  His  effort  concentrated  on  the  wake  behind  the  particle  and  provided  very 
limited  results. 

The  approach  adopted  in  this  work  is  to  extend  and  apply  techniques 
developed  for  solving  the  ballistic  reentry  problem  to  the  special  situation  of 
the  shaped  charge  jet  as  it  passes  near  interfering  surfaces.  The  method 
employed was originally developed by S.K. Godunov [2,3] in 1959-1961 and applied to
the hypersonic blunt body by Masson et al. [4]. The technique has also been used at
BRL to simulate the flow field near the muzzle blast [5,6].

The numerical method will be briefly outlined in the following section.
The basic technique for solving the conservation equations has been extended by
coupling to it a program to compute the real gas thermodynamic properties
appropriate to the hypersonic flow following the methods developed by Hansen [7].
The real gas jump conditions across the shock waves in the flow are evaluated
using a method proposed by Colella and Glaz [8].


II. GODUNOV TECHNIQUE

In  this  section  the  essential  elements  of  the  Godunov  method  are  described 
as  applied  to  the  simplest  case  of  one  dimensional  flow.  The  applicable 
conservation  equations  for  the  axisymmetric  flow  which  are  used  in  the 
computation  are  then  described,  followed  by  the  discretization  actually  employed 
in  the  solution  algorithm.  Finally  in  this  section  some  details  of  the  real  gas 
method  are  presented. 


1-D GODUNOV-RIEMANN TECHNIQUE

The  conservation  equations  for  mass,  momentum  and  energy  are  written  in 
integral form and applied to the physical domain divided into cells of width Δx
as  shown  in  Figure  3.  At  an  initial  instant  the  flow  variables  in  each  cell  are 
defined  as  spatial  average  values.  Discontinuous  changes  of  properties  in 
general  occur  at  the  cell  boundaries.  At  the  beginning  of  a  time  step,  the 
imaginary  diaphragm  at  the  cell  boundary  is  ruptured  and  compression  and 
expansion  waves  are  assumed  to  propagate  into  adjacent  cells.  This  is  analogous 
to  the  classical  shock  tube  problem.  The  conditions  behind  these  waves  are  well 
known  from  shock  tube  theory.  The  fluid  between  the  two  waves  can  be  considered 
as  two  fluids,  one  from  each  cell,  separated  by  the  contact  discontinuity.  The 
pressure and velocity of the gas are the same on both sides of the contact
discontinuity.  As  the  waves  propagate  the  original  cell  boundary  lies  in  one  of 
four  possible  flow  regions.  Both  waves  may  move  into  the  cell  to  the  right,  or 
both  waves  move  to  the  left.  In  either  of  these  cases  the  flow  at  the  boundary 
is  defined  by  the  undisturbed  flow  in  left  or  right  cell  respectively.  The 
boundary  can  be  between  the  two  waves  whereupon  the  flow  is  determined  by 
whether  the  contact  discontinuity  moves  in  the  positive  or  negative  direction. 

The basic equations to be solved for conditions behind the waves can be
written  as  the  following  two  simultaneous  equations: 


$$a(u_2 - u_1) + (p_2 - p_1) = 0 \qquad (1)$$

$$b(u_3 - u_4) - (p_3 - p_4) = 0 \qquad (2)$$

where the subscripts refer to regions defined in Figure 3. Because of
continuity across the contact discontinuity, $u_2 = u_3 = u_0$ and $p_2 = p_3 = p_0$. The
coefficients a and b are mass velocities determined by the respective wave
speeds and the density across the waves. In general a and b are functions of
the pressure behind the waves, which makes the solution for $p_0$ and $u_0$ nonlinear.
Iterative techniques are used to solve Equations (1) and (2) except when the
waves are weak, in which case a linear approximation is valid. With a
sufficiently fine grid most of the shock free flow field can be obtained using
the simpler linear approximation. The real gas relationships between
thermodynamic variables also complicate the calculations and will be discussed later.
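
For the weak-wave case, Equations (1) and (2) become linear once a and b are fixed, and the contact values follow in closed form. The Python sketch below assumes the usual acoustic choice of a and b as density times sound speed in regions 1 and 4; that choice and the function name are assumptions for illustration, not taken from the paper.

```python
def contact_state_linear(rho1, u1, p1, c1, rho4, u4, p4, c4):
    """Solve Equations (1) and (2) for (u0, p0) with constant a and b.

    Regions 1 and 4 are the undisturbed states on either side of the cell
    boundary; c1 and c4 are their sound speeds.
    """
    a = rho1 * c1  # mass velocity of the wave facing region 1 (assumed)
    b = rho4 * c4  # mass velocity of the wave facing region 4 (assumed)
    u0 = (p1 - p4 + a * u1 + b * u4) / (a + b)
    p0 = (b * p1 + a * p4 + a * b * (u1 - u4)) / (a + b)
    return u0, p0
```

When the waves are strong, the same two equations are solved iteratively, with a and b re-evaluated from the current estimate of the pressure behind each wave.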

The  essential  element  in  the  technique  is  that  the  properties  are  known  and 
constant  at  the  cell  boundary  until  the  arrival  of  waves  developed  at 
neighboring  cell  boundaries.  If  the  time  step  of  the  calculation  is  kept  less 
than  the  time  required  for  the  waves  to  cross  the  cell  then  the  fluxes  at  the 
cell  boundaries  are  easily  evaluated.  The  average  properties  in  the  cell  at  the 
end  of  the  time  step  are  then  determined  as  the  initial  value  plus  the  fluxes 
across  both  boundaries  during  the  time  step.  Thus  a  time  marching  scheme  is 
defined  which  progresses  from  an  imposed  initial  condition  to  a  steady  state. 
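
The time-marching update can be sketched for a single conserved quantity on a uniform 1-D grid. The names and the safety factor `cfl` are assumptions; the boundary fluxes are taken as already computed from the Riemann solutions at each cell boundary.

```python
def stable_time_step(dx, wave_speeds, cfl=0.9):
    """Time step short enough that no wave crosses a full cell."""
    return cfl * dx / max(abs(s) for s in wave_speeds)


def update_cells(u, flux, dt, dx):
    """Advance cell averages by one time step.

    u    : cell averages, length n
    flux : fluxes at the n + 1 cell boundaries, flux[i] on the left of cell i
    """
    return [u[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(len(u))]
```

Repeating the update until the cell averages stop changing gives the steady state referred to above; the total of `u` changes only through the fluxes at the two ends of the grid.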

In  the  case  of  an  adaptive  cell  distribution,  the  fluxes  are  calculated 
taking  into  account  the  relative  velocity  between  the  fluid  and  the  moving  cell 
boundary.


CONSERVATION  EQUATIONS 

The  fundamental  elements  of  the  Godunov  method  described  for  the  simpler 
one  dimensional  problem  have  been  extended  by  Godunov  and  many  others  to  two 
dimensional  axisymmetric  flows.  The  flow  near  the  leading  element  of  the  shaped 
charge jet is assumed to be axisymmetric; thus the starting point is the
conservation  equations  in  cylindrical  coordinates,  as  follows: 


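$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x}(\rho u) + \frac{\partial}{\partial r}(\rho v) + \frac{\rho v}{r} = 0 \qquad (3)$$

$$\frac{\partial}{\partial t}(\rho u) + \frac{\partial}{\partial x}(p + \rho u^2) + \frac{\partial}{\partial r}(\rho u v) + \frac{\rho u v}{r} = 0 \qquad (4)$$

$$\frac{\partial}{\partial t}(\rho v) + \frac{\partial}{\partial x}(\rho u v) + \frac{\partial}{\partial r}(p + \rho v^2) + \frac{\rho v^2}{r} = 0 \qquad (5)$$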

For the blunt nose part of the calculation it is convenient to define the
cells in a polar coordinate system because the blunt body shock wave is nearly
concentric to the spherical surface in the stagnation region. Figure 4
illustrates how the cell geometry is defined in this region. The shock
location is estimated initially and the grid dimensions are allowed to change in the
radial  direction  until  the  steady  state  shock  position  is  obtained.  The  above 
differential  equations  of  motion  are  integrated  over  a  cell  area  as  shown  in  the 
figure  and  Table  1  gives  the  resulting  discretized  equations.  Note  that  R,U,V 
and  E  are  the  density,  x  and  r  components  of  velocity  and  total  energy 
respectively,  evaluated  on  the  cell  boundary.  Thus  these  properties  are 
obtained  from  the  solution  of  a  Riemann  problem  at  that  boundary.  The 
subscripts  indicate  which  boundary  as  defined  in  Figure  4.  Note  that  on  radial 
boundaries  the  fluxes  are  evaluated  using  the  velocity  W  which  is  the  component 
of  the  velocity  vector  normal  to  the  cell  boundary.  On  the  moving 
circumferential  boundaries  the  flux  is  determined  by  the  relative  velocity 
component normal to the moving element, (W-q). Pressure forces contribute to
the momentum equations depending on the orientation of the cell boundary
relative to the cylindrical coordinate system; thus the angles θ and φ which
specify  the  orientation  of  the  boundary  must  be  included. 

In  both  the  hemisphere  and  cylindrical  computation  the  downstream  boundary 
was located in a supersonic flow region where the wave system moves
downstream. Thus the flux conditions on the boundary of the cell are determined by
the  properties  of  the  cell,  and  no  special  boundary  condition  is  required. 

The  last  term  in  each  equation  caused  some  computational  difficulties  for 
cells  with  the  axis  of  symmetry  as  a  boundary.  Although  the  ratio  of  v/r  is 
finite  on  the  boundary,  its  evaluation  introduces  errors  which  lead  to 
instabilities  in  the  computation.  One  source  of  the  difficulty  is  related  to 
discretization  of  the  shock  wave  at  the  axis  of  symmetry  where  it  should  be 
normal.  In  the  present  computations  the  slope  of  the  shock  at  the  first  cell 
with  its  boundary  on  the  axis  tended  to  an  unrealistically  negative  value.  By 
arranging  the  cells  so  that  the  center  of  the  first  cell  falls  on  the  axis,  the 
normality of the shock on the axis is ensured. This occurs because all
conditions  on  the  two  radial  boundaries  are  forced  to  be  symmetrical  as  the 
appropriate  boundary  condition.  The  line  segment  representing  the  stagnation 
region  shock  wave  is  then  automatically  normal  to  the  axis. 

REAL  GAS  EFFECTS 

In  the  initial  phases  of  the  development  of  the  numerical  code,  a  perfect 
gas equation of state was assumed, i.e.:

$$p = \rho R T \qquad (7)$$

e.  “  (8) 

At  Mach  20,  however,  the  relationship  between  pressure  and  temperature  and  the 
other thermodynamic variables is much more complex. Although the magnitudes of
the pressures calculated using perfect gas formulas are approximately correct,
the  shock  wave  position  and  thus  pressure  distribution  are  strongly  affected  by 
real  gas  density  and  temperature. 

The thermodynamic properties of high temperature air are computed from
approximate partition functions for the major components of air using a
technique developed by Hansen [7]. The model assumes air to be a mixture of 20 percent
oxygen and 80 percent nitrogen; all other components are neglected. Eleven
species are considered, including three levels of ionization of oxygen and
nitrogen.


The  method  calculates  all  the  thermodynamic  variables  given  the  pressure 
and  temperature  of  the  gas.  This  is  inconvenient  when  coupled  to  the  Godunov 
code  because  its  primary  dependent  variables  include  density  and  energy.  An 
iteration procedure is required as an intermediate step to search for the correct
values of temperature and pressure which correspond to given density and
energy.
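
The search described above can be sketched as a two-variable Newton iteration. This is only an illustration under stated assumptions: the report does not say which iteration was used, and a perfect gas routine (`perfect_gas_state`, a hypothetical name) stands in here for the Hansen property routine, which is not reproduced in this report. Any routine returning density and specific energy from pressure and temperature could be passed in its place.

```python
def perfect_gas_state(p, t, r_gas=287.0, gamma=1.4):
    """Stand-in property routine (assumption): (density, energy) from (p, T)."""
    rho = p / (r_gas * t)
    e = r_gas * t / (gamma - 1.0)
    return rho, e


def invert_state(rho_target, e_target, state=perfect_gas_state,
                 p0=1.0e5, t0=300.0, tol=1.0e-10, max_iter=50):
    """Find (p, T) whose density and energy match the targets (Newton iteration)."""
    p, t = p0, t0
    for _ in range(max_iter):
        rho, e = state(p, t)
        f1 = rho / rho_target - 1.0      # scaled density mismatch
        f2 = e / e_target - 1.0          # scaled energy mismatch
        if abs(f1) < tol and abs(f2) < tol:
            return p, t
        # Finite-difference Jacobian of the scaled mismatches.
        dp, dt = 1.0e-6 * p, 1.0e-6 * t
        rho_p, e_p = state(p + dp, t)
        rho_t, e_t = state(p, t + dt)
        a11 = (rho_p - rho) / (dp * rho_target)
        a12 = (rho_t - rho) / (dt * rho_target)
        a21 = (e_p - e) / (dp * e_target)
        a22 = (e_t - e) / (dt * e_target)
        det = a11 * a22 - a12 * a21
        # Solve the 2x2 system A * (dP, dT) = -(f1, f2) by Cramer's rule.
        p += (-f1 * a22 + f2 * a12) / det
        t += (-a11 * f2 + a21 * f1) / det
    raise RuntimeError("state inversion did not converge")


rho_given, e_given = perfect_gas_state(5.0e6, 6000.0)
p_found, t_found = invert_state(rho_given, e_given)
```

With a real-gas routine the Jacobian is no longer nearly triangular, so a damped update or a bracketing search may be the safer choice.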


A  second  major  problem  with  adding  real  gas  effects  is  that  the  above 
search  for  the  thermodynamic  variables  makes  the  computation  of  even  the  weak 
wave  Riemann  problem  nonlinear  and  iterative.  In  order  to  avoid  increasing  the 
running  time  of  the  computation  extensively,  a  procedure  suggested  by  Colella 
and Glaz [8] has been adopted. Their method permits evaluation of the pressure and
velocity  at  the  contact  discontinuity  based  on  conditions  in  adjacent  cells.  In 
the  strong  shock  case,  it  is  still  necessary  to  iterate  as  in  the  perfect  gas 
case,  but  it  is  not  necessary  to  iterate  at  the  same  time  to  find  the 
thermodynamic variables.


III. RESULTS


The results of the computations are summarized by first considering the
hemisphere problem at the relatively low speed of Mach 4 where experimental
verification can be made. Some results for the flow downstream between
concentric cylinders are considered to illustrate the application to the shaped
charge jet passing through a cylindrical tube. Finally, hemisphere and cylinder
computations at Mach number 20.45 are presented.


HEMISPHERE  AT  MACH  NUMBER  4 


Figure 5 shows the shock wave stand-off distance plotted around the
hemisphere and compared to experimental data [9]. This Mach 4 case was chosen for
these initial code verification runs because of the existence of the
experimental wind tunnel observations and because this was one of Godunov's original
test cases. These results were obtained with a relatively coarse grid of only 8
points radially and 25 cells in the angular direction. Tests with other grid
distributions showed only very minor changes. Figure 6 shows the
corresponding pressure distribution on the hemisphere, again compared to the same
experiment. Note that the real gas form of the code was used even though the
conditions were essentially those of a perfect gas. The code underpredicts the
measured stagnation pressure by less than 2.5 percent. There is a small kink
in the pressure curve at about 45 degrees from the stagnation point which
appears to be associated with the sonic line in the flow field. It is somewhat
exaggerated because of the coarse grid used. The mass density distribution is
shown in Figure 7, where the current calculations are compared to the original
results of Godunov and to a well-known calculation of Belotserkovskii [10] using the
method of integral relations. The Belotserkovskii calculation is about 2
percent higher at the stagnation point but otherwise the agreement is very good.
Godunov's calculation is lower than the others and is significantly different at
the stagnation point. The disagreement appears to arise because
Godunov's grid provided for a grid boundary on the line of symmetry, which
introduced errors that do not disappear as steady state is approached.

ANNULAR  REGION  AT  MACH  NUMBER  4 

Once the hemisphere computation was completed, it was used to provide
upstream boundary conditions for the computation of the flow between a
cylindrical afterbody and a concentric wall. This configuration is meant to
simulate a jet passing through a cylindrical tube. Unlike the hemisphere
computation, however, the grid or cell distribution was fixed in space with 40 cells
radially and 120 axially. Uniform initial conditions were assumed and the code
marched in time until steady state was achieved. Figure 8 is an example of a
contour plot of the pressure from such a calculation. The outer wall diameter
is 1.766 body diameters. The hemisphere shock wave reflects off the outer wall
and produces a nearly normal shock standing on the body. Such shocks impinging
on the jet could produce strong aerodynamic forces at the higher Mach numbers of
interest.


MACH  NUMBER  20.45  RESULTS 

Figures  9  and  10  show  the  results  of  the  hemispherical  computation  for  Mach 
number  20.45.  The  stagnation  pressure  in  this  case  is  only  a  few  percent  above 
the  539  atmospheres  predicted  by  perfect  gas  theory.  The  result  of  the  cylinder 
computations is shown in Figure 11, for a tube to jet diameter ratio of 2.50.

The  bow  shock  is  plainly  visible.  The  shock  angle  at  the  wall  is  12.8°,  and  the 
reflection  occurs  at  a  position  three  body  diameters  downstream  of  the  tangent 
point  between  the  hemisphere  and  cylinder.  The  pressure  on  the  wall  behind  the 
reflected  wave  is  71  atm.  The  reflected  wave  can  be  traced  back  to  the  body, 
where  it  again  reflects.  The  pressure  behind  this  reflection  is  34  atm. 
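
The perfect gas stagnation pressure quoted above can be checked with the classical Rayleigh pitot formula for the pressure behind a normal shock brought to rest. The sketch below assumes a ratio of specific heats of 1.4 and a free-stream pressure of one atmosphere; neither value is stated in this section, and the function name is illustrative only.

```python
def pitot_pressure_ratio(mach, gamma=1.4):
    """Rayleigh pitot formula: stagnation pressure behind a normal shock
    divided by the free-stream static pressure (perfect gas)."""
    m2 = mach * mach
    a = ((gamma + 1.0) * m2 / 2.0) ** (gamma / (gamma - 1.0))
    b = ((gamma + 1.0) / (2.0 * gamma * m2 - (gamma - 1.0))) ** (1.0 / (gamma - 1.0))
    return a * b


# With a free-stream pressure of 1 atm (assumption) the ratio is the
# stagnation pressure in atmospheres.
ratio_mach_20 = pitot_pressure_ratio(20.45)
ratio_mach_4 = pitot_pressure_ratio(4.0)
```

Under these assumptions the Mach 20.45 ratio comes out close to the 539 atmospheres cited in the text.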


IV.  SUMMARY  AND  CONCLUSIONS 

This report has presented the current status of an ongoing program to
calculate the aerodynamic forces on a hypersonic shaped charge jet. A numerical
technique based on the work of S. K. Godunov has been modified consistent with
the  blunt  jet  configuration  penetrating  a  cylindrical  tube  and  extended  to 
include  real  gas  properties.  The  ability  of  the  code  to  simulate  Mach  number  4 
conditions  for  which  experimental  and  other  numerical  data  are  available  has 
been  used  to  validate  the  procedure.  Numerical  results  have  been  completed  for 
the  Mach  number  20.45  jet  problem.  These  computations  will  be  compared  to 
experimental  measurements  of  the  reflected  wave  from  an  actual  shaped  charge  jet 
going  through  a  cylindrical  tube.  Results  from  these  experiments  should  be 
reported  in  the  near  future. 

Figure 8. Pressure Contour Plot in Annulus, Mach 4.0.


SKEW GRIDS AND IRROTATIONAL FLOW


Robert  S.  Bernard 
Hydraulics  Laboratory 

U.S.  Army  Engineer  Waterways  Experiment  Station 
P.O. Box 631, Vicksburg, MS 39180-0631

ABSTRACT. Finite-difference computation of incompressible flow through
regions of arbitrary shape often requires the implementation of
boundary-fitted coordinates for which the grid lines may be non-orthogonal (skew).
When the governing equations are expressed in terms of pressure and velocity,
conservation of mass is maintained by the gradient of the pressure. In
principle, the gradient is irrotational and should have no effect on the
existing circulation in the flow field; but if the grid lines are skew, the
discrete representation of the gradient can generate spurious vorticity near
the boundaries. In the present work this difficulty is eliminated for uniform
skew grids, and markedly reduced for non-uniform skew grids, by adopting a
discrete formulation of the pressure gradient that helps maintain
irrotationality near boundaries. The procedure is applicable for staggered grids with
either Poisson or Chorin equations for pressure.

I. INTRODUCTION. The role of the pressure for incompressible flow is
simply to constrain the velocity vector $\mathbf{u}$ such that

$\nabla \cdot \mathbf{u} = 0$    (1)

which represents conservation of mass. Assuming that the velocity field
conserves mass at some time t, let $\mathbf{u}'$ be a velocity field that would
exist at time t' = t + dt in the absence of pressure. Then the
corresponding mass-conserving velocity field can always be written as

$\mathbf{u} = \mathbf{u}' - \rho^{-1} \nabla \phi$    (2)

where $\rho$ is the density and the scalar potential $\phi$ is related to the pressure
p by

$\phi = p \, dt$    (3)

Combining Equations 1 and 2, it follows that $\phi$ satisfies the Poisson equation,

$\nabla^2 \phi = \rho \nabla \cdot \mathbf{u}'$    (4)

As long as the primitive variables $\mathbf{u}$ and p are retained as the unknowns in
the governing equations of motion, then it is necessary to solve Equation 4 in
order to maintain the constraint given by Equation 1. Even if Chorin's method
of pseudo-compressibility [1] is used to add a time derivative of pressure to
Equation 1, the end result is equivalent to having solved Equation 4 when the
flow reaches steady state.
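
The correction of a non-conserving velocity field by the gradient of a scalar potential can be illustrated on the simplest possible grid. The sketch below assumes a uniform, orthogonal, periodic staggered grid with unit spacing and unit density, and it uses Gauss-Seidel relaxation as the Poisson solver; these choices, the function names, and the test field are illustrative assumptions and are not taken from the paper.

```python
import math


def divergence(u, v):
    """Net outflow of each cell; u[i][j] and v[i][j] sit on the east and
    north faces of cell (i, j) of a periodic grid with unit spacing."""
    n = len(u)
    return [[u[i][j] - u[i - 1][j] + v[i][j] - v[i][j - 1]
             for j in range(n)] for i in range(n)]


def project(u, v, sweeps=500):
    """Return the mass-conserving field u' - grad(phi) for unit density."""
    n = len(u)
    rhs = divergence(u, v)               # right-hand side of the Poisson equation
    phi = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):              # Gauss-Seidel relaxation sweeps
        for i in range(n):
            for j in range(n):
                neighbors = (phi[(i + 1) % n][j] + phi[i - 1][j]
                             + phi[i][(j + 1) % n] + phi[i][j - 1])
                phi[i][j] = 0.25 * (neighbors - rhs[i][j])
    u_new = [[u[i][j] - (phi[(i + 1) % n][j] - phi[i][j])
              for j in range(n)] for i in range(n)]
    v_new = [[v[i][j] - (phi[i][(j + 1) % n] - phi[i][j])
              for j in range(n)] for i in range(n)]
    return u_new, v_new


n = 8
w = 2.0 * math.pi / n
u_prime = [[1.0 + 0.3 * math.sin(w * i) * math.cos(w * j)
            for j in range(n)] for i in range(n)]
v_prime = [[-10.0 + 0.3 * math.cos(w * i) * math.sin(w * j)
            for j in range(n)] for i in range(n)]
u_fixed, v_fixed = project(u_prime, v_prime)
worst_before = max(abs(d) for row in divergence(u_prime, v_prime) for d in row)
worst_after = max(abs(d) for row in divergence(u_fixed, v_fixed) for d in row)
```

On an orthogonal grid of this kind the discrete gradient is exactly irrotational, which is why the difficulties discussed in the rest of the paper appear only when the grid lines are skew.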

The purpose herein is not, however, to discuss the pros and cons of
pseudo-compressibility  versus  Poisson  equations  in  the  numerous  existing 
algorithms  for  solving  the  coupled  momentum  and  continuity  equations.  It  is, 
rather,  to  consider  the  proper  formulation  of  discrete  approximations  for  the 
pressure  gradient  and  its  boundary  conditions  on  non-orthogonal  curvilinear 
grids.  Improper  treatment  of  the  gradient  near  computational  boundaries  can 
add  circulation  to  the  flow,  in  which  case  the  discrete  representation  of  the 
gradient is not irrotational as it should be. This sort of error does not
arise if the computational grid is orthogonal; but if the grid is
non-orthogonal, special treatment of the derivatives in the gradient is necessary
to avoid or minimize the creation of spurious vorticity. The objective of the
present work is to ascertain what that treatment might be for the case of
non-orthogonal, non-uniform, curvilinear finite-difference grids.

II.  DISCRETE  FORMULATION.  Consider  a  two-dimensional  staggered  grid  of 
the  Marker-and-Cell  type  [2] ,  with  the  pressure  (and  the  scalar  potential) 
computed  at  the  cell  centers,  and  the  velocity  components  (u,v)  at  the 
midpoints  of  the  cell  faces,  as  shown  in  Figures  1  and  2.  Assuming  a  unit 
density  and  a  unit  depth  normal  to  the  page,  the  mass-flux  components  through 
the  right  (east)  and  upper  (north)  cell  faces  are  denoted  by  U  and  V  , 
respectively,  and  are  related  to  the  cartesian  velocity  components  (u,v)  by 

$U = y_\eta u - x_\eta v$    (5)

$V = x_\xi v - y_\xi u$    (6)

The curvilinear coordinates $(\xi, \eta)$ follow the grid lines shown in Figure 1, and
they are functions of the cartesian coordinates (x,y). Conservation of mass
for each grid cell demands that

$U_\xi + V_\eta = 0$    (7)

Denoting  non-conservative  velocities  and  fluxes  with  a  prime,  we  then  have  the 
relations 

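$u = u' - \phi_x$    (8)

$v = v' - \phi_y$    (9)

Using the chain rule [3] to evaluate the x- and y-components of the gradient,
we find that

$\phi_x = J^{-1} (y_\eta \phi_\xi - y_\xi \phi_\eta)$    (10)

$\phi_y = J^{-1} (x_\xi \phi_\eta - x_\eta \phi_\xi)$    (11)

$J = x_\xi y_\eta - x_\eta y_\xi$    (12)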


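Combining Equations 5 through 11, we obtain the discrete analog of Equation 4,

$A_e - A_w + B_n - B_s = U'_e - U'_w + V'_n - V'_s$    (13)

The sub/superscripts (e, w, n, s) indicate quantities evaluated on the (east,
west, north, south) cell faces, as shown in Figure 2, and

$A = \alpha \phi_\xi - \gamma \phi_\eta$    (14)

$B = \beta \phi_\eta - \gamma \phi_\xi$    (15)

$\alpha = J^{-1} (x_\eta^2 + y_\eta^2)$    (16)

$\beta = J^{-1} (x_\xi^2 + y_\xi^2)$    (17)

$\gamma = J^{-1} (x_\xi x_\eta + y_\xi y_\eta)$    (18)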
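
The role of the coefficient $\gamma$ as the measure of skewness can be seen in a short numerical sketch. It assumes the usual curvilinear forms $J = x_\xi y_\eta - x_\eta y_\xi$, $\alpha = (x_\eta^2 + y_\eta^2)/J$, $\beta = (x_\xi^2 + y_\xi^2)/J$, $\gamma = (x_\xi x_\eta + y_\xi y_\eta)/J$, with fluxes $U = y_\eta u - x_\eta v$ and $V = x_\xi v - y_\xi u$. The function names and the parallelogram grid with unit-length cell sides are hypothetical choices made for illustration.

```python
import math


def metric_coefficients(x_xi, x_eta, y_xi, y_eta):
    """Return (J, alpha, beta, gamma) for given grid derivatives."""
    jac = x_xi * y_eta - x_eta * y_xi
    alpha = (x_eta ** 2 + y_eta ** 2) / jac
    beta = (x_xi ** 2 + y_xi ** 2) / jac
    gamma = (x_xi * x_eta + y_xi * y_eta) / jac
    return jac, alpha, beta, gamma


def fluxes(u, v, x_xi, x_eta, y_xi, y_eta):
    """Mass fluxes through the east and north cell faces (unit density)."""
    return y_eta * u - x_eta * v, x_xi * v - y_xi * u


# Parallelogram cells: eta-lines tilted 10 degrees away from the y-axis.
tilt = math.radians(10.0)
skew = metric_coefficients(1.0, math.sin(tilt), 0.0, math.cos(tilt))
square = metric_coefficients(1.0, 0.0, 0.0, 1.0)
flux_east, flux_north = fluxes(1.0, 0.0, 1.0, math.sin(tilt), 0.0, math.cos(tilt))
```

For the square cell $\gamma$ vanishes, while for the tilted cell it equals the tangent of the tilt angle, so the cross-derivative terms in A and B grow with the skewness.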

Note that the right-hand side of Equation 13 is simply the imbalance in mass
flux produced by the nonconservative fluxes (U',V'), while the left-hand side
is the sum of the flux corrections provided by the gradient of $\phi$. After
Equation 13 has been solved for $\phi$, and the flux corrections computed
therefrom, the mass-conserving fluxes can be obtained from

$U = U' - A$    (19)

$V = V' - B$    (20)

The particular form exhibited by Equation 13 facilitates the elimination of
superfluous derivatives when one or more of the cell faces coincides with a
boundary where there is to be no adjustment to U' or V'. Such is the case
for solid boundaries and for open boundaries where the flux normal to the
boundary is known or specified. For example, if the east cell face coincides
with a boundary, then $A_e = 0$ and Equation 13 reduces to

$-A_w + B_n - B_s = U'_e - U'_w + V'_n - V'_s$    (21)

Likewise, if the north cell face coincides with a boundary, then $B_n = 0$, and
Equation 13 reduces to

$A_e - A_w - B_s = U'_e - U'_w + V'_n - V'_s$    (22)


If the grid is orthogonal ($\gamma = 0$) there is no question as to how to proceed.
In the case of Equation 21, only $\eta$-derivatives are needed on the north and
south faces, and these can be evaluated from information inside the field.
Similarly, for Equation 22, only $\xi$-derivatives are needed on the east and west
faces. Assuming the grid has unit spacing ($\Delta\xi = \Delta\eta = 1$) in the computational
$(\xi, \eta)$ space [4], then

$(\phi_\xi)_e = \phi_{ee} - \phi_{cc}$    (23)

$(\phi_\eta)_n = \phi_{nn} - \phi_{cc}$    (24)

The  double  subscripts  (cc,  ee,  ww,  nn,  ss,  ne,  nw,  se,  sw)  indicate  quantities 
at  the  centers  of  neighboring  cells  (central,  east,  west,  etc.)  as  shown  in 
Figure  2. 


If the grid is non-orthogonal ($\gamma \neq 0$), then $\xi$-derivatives are needed on
the north and south faces, and $\eta$-derivatives are needed on east and west
faces. On cell faces not touching boundaries, we can approximate these with

But for cell faces with one end touching a boundary, there is an ambiguity:
Equations  25  and  26  require  information  across  the  boundary  in  cells  lying 
outside  the  flow  field.  The  ambiguity  arises  because  there  are  a  number  of 
plausible  ways  to  compute  the  needed  information,  but  no  indication  a  priori 
as  to  which  one  is  best.  In  order  to  examine  the  possibilities,  let  us  focus 
attention  on  a  cell  whose  east  face  coincides  with  a  boundary. 


We must find an approximation for $\phi_\xi$ on the north and south cell faces,
and we shall consider only three possibilities although there certainly exist
others. The first is simply to replace $\phi_\xi$ on the north face by its value on
the northwest corner of the cell, using the difference expression,

Equation 27 imposes a $\xi$-derivative using information in the flow field,
irrespective of what happens on the boundary. For reference we shall call
this the "field approximation". The second possibility is to calculate $\phi_\xi$ on
the north face by using the same condition that exists on the east face.
Specifically, on the east face we have the boundary condition,

Applying the same constraint on the north face, we obtain

and the discrete approximation for $\phi_\xi$ becomes

Equation 30 imposes a $\xi$-derivative using only the boundary constraint,
irrespective of what happens in the flow field. For reference we shall call
it the "boundary approximation". As the third alternative, we
approximate $\phi_\xi$ by a simple average of Equations 27 and 30, which we shall call
the "mixed approximation",


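$(\phi_\xi)_n = \tfrac{1}{2} \, [\, \text{right-hand side of Equation 27} + \text{right-hand side of Equation 30} \,]$    (31)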


We now have three discrete alternatives for representing ambiguous
derivatives of $\phi$ adjacent to boundaries. Equations 27, 30, and 31 pertain
to $\xi$-derivatives for north faces touching boundaries of constant $\xi$. The same
principles apply for ambiguous derivatives on other faces touching (but not
coincident with) flow field boundaries.
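
The three alternatives can be compared in a few lines of code. The difference formulas below are assumptions inferred from the verbal descriptions, since the displayed equations are not legible in this copy: the field approximation is taken as the $\xi$-derivative at the northwest cell corner, averaged from the two adjacent rows of cell centers, and the boundary approximation is taken from the condition $A = \alpha \phi_\xi - \gamma \phi_\eta = 0$ applied on the north face. The function names are hypothetical.

```python
def field_approx(phi_cc, phi_nn, phi_ww, phi_nw):
    """Assumed field form: xi-derivative at the northwest corner of the cell,
    averaged from the central and northern rows (unit spacing)."""
    return 0.5 * ((phi_cc - phi_ww) + (phi_nn - phi_nw))


def boundary_approx(phi_cc, phi_nn, alpha, gamma):
    """Assumed boundary form: alpha*phi_xi - gamma*phi_eta = 0 on the north
    face, with phi_eta = phi_nn - phi_cc."""
    return (gamma / alpha) * (phi_nn - phi_cc)


def mixed_approx(phi_cc, phi_nn, phi_ww, phi_nw, alpha, gamma):
    """Simple average of the field and boundary approximations."""
    return 0.5 * (field_approx(phi_cc, phi_nn, phi_ww, phi_nw)
                  + boundary_approx(phi_cc, phi_nn, alpha, gamma))


# Linear potential phi = 2*xi + 3*eta sampled at the cell centers
# cc = (0, 0), nn = (0, 1), ww = (-1, 0), nw = (-1, 1).
cc, nn, ww, nw = 0.0, 3.0, -2.0, 1.0
d_field = field_approx(cc, nn, ww, nw)
d_boundary = boundary_approx(cc, nn, alpha=1.0, gamma=0.0)
d_mixed = mixed_approx(cc, nn, ww, nw, alpha=1.0, gamma=0.0)
```

For this linear potential the assumed field form returns the exact $\xi$-derivative, whereas the boundary form returns whatever the wall condition dictates, independent of the interior values to the west.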


III. TEST CASES AND RESULTS. In order to ascertain which of the three
alternatives is best for representing ambiguous derivatives of $\phi$, we need
test problems that show clearly the adverse effects of grid skewness.
Moreover,  we  should  pay  special  attention  to  the  possible  creation  of  spurious 
vorticity  arising  from  improper  representation  of  the  gradient  near 
boundaries.  Thus  we  propose  two  classes  of  tests: 


1.  Non-orthogonal  grids  with  uniform  spacing. 


2.  Non-orthogonal  grids  with  non-uniform  spacing. 


The  first  category  allows  us  to  see  the  effects  of  skewness  alone,  while  the 
second adds the possible compounding of error due to non-uniformity.


In all cases the flow field in the physical space will be bounded above
and below by solid boundaries, while the left and right boundaries will be
open with uniform normal components of velocity: u = 1. Inside the flow
field we impose the velocity condition: u' = 1, v' = -10. We then solve
Equation 13 subject to the constraint that the flux normal to the boundaries
remain fixed and no vorticity be created in the flow field. The large
vertical velocity creates a proportionately large violation of continuity at
the upper and lower solid boundaries, which is to be eliminated by the
gradient of the scalar potential. In all cases the physical boundaries are
chosen such that the resulting streamlines should be straight lines, and any
deviation therefrom indicates the presence of error.


Results from the first three test cases are presented in Figures 3
through 5, showing computed streamlines for the flow through parallelograms
of increasing skewness. Even at 10 degrees, the boundary and mixed
approximations exhibit an unacceptable amount of circulation, while the field
approximation produces straight lines for each case. The computed solutions
were  all  converged  to  a  maximum  residual  of  less  than  0.001  in  Equation  13, 
with  the  residual  z  defined  by 

Figure 6. Grid and computed streamlines for uniform, irrotational flow
through rectangular channel.

The last test involves a grid with non-uniformity as well as skewness,
shown in Figure 6. In this case the flow field is rectangular in the physical
(x,y) space, but L-shaped in the computational $(\xi, \eta)$ space. No one would
actually use such a distorted grid for serious computation, but it serves our
needs in that it allows us to observe directly the grid-induced error in a
flow where we know the exact solution in advance (u = 1). Moreover, the large
continuity violation associated with the initial vertical velocity (v' = -10)
is  intended  to  magnify  the  error.  As  in  the  examples  for  uniform  grids,  the 
field  approximation  generates  far  better  results  than  the  boundary  and  mixed 
approximations,  but  there  is  still  some  distortion  of  the  streamlines  even 
with  the  field  approximation. 


IV.  CONCLUSION.  Three  alternatives  have  been  proposed  for  representing 
ambiguous  derivatives  in  the  pressure  gradient  on  non-orthogonal  grid  cells 
adjacent  to  flow  field  boundaries.  For  the  test  cases  presented  herein,  the 
best results were obtained with the field approximation; that is, by replacing
the  ambiguous  derivative  on  a  cell  face  by  its  value  on  the  adjacent  cell 
corner  lying  in  the  field  (rather  than  on  the  boundary).  Using  this  approach, 
the  discrete  pressure  gradient  remains  irrotational  for  uniform  skew  grids, 
creating  no  spurious  vorticity  whatsoever.  The  presence  of  non-uniformity  and 
skewness  together,  however,  can  generate  grid-induced  vorticity  even  with  the 
field  approximation.  Thus,  while  non-orthogonal  staggered  grids  can  indeed  be 
used  for  computing  incompressible  flow,  it  is  advisable  to  keep  the  grid  as 
smooth as possible in the presence of skewness.

ABSTRACT. Maximal length linear shift register sequences
(m-sequences) are used in a number of communications applications. Their
nearly ideal randomness properties are what make these sequences so
widely used. In this note we discuss an additional randomness property that
m-sequences possess.

1. INTRODUCTION. Maximal length shift register sequences
(m-sequences) have been used in a myriad of applications for high-speed
communications. The nearly ideal randomness properties of m-sequences are
the primary reason for their extensive applicability. The balance and run
properties are intrinsic in other sequences such as full sequences of
length $2^n$ [1]. However, it is the correlation property (or shift-and-add
property) which makes m-sequences useful for spread spectrum [2]-[3],
synchronization [4], range-radar [5], error correction [6]-[7], random
number generation [8] and other applications. For additional properties of
m-sequences, applications and methods for their generation, see [9]-[10].
In this note we discuss another randomness property which is also possessed
by m-sequences. Actually this property is equivalent to the shift-and-add
property, though not obviously so.


2. A RANDOMNESS PROPERTY. Suppose a pair of distinct elements (A,B)
is drawn at random from the set $T = \{1, 2, 3, \ldots, 2^n-2\}$ where A < B. We
seek the expected values of A and B.


The number of ways of selecting any pair of numbers, without replacement,
from a set of $(2^n-2)$ elements is $(2^n-2)(2^n-3)/2 = C(2^n-2, 2)$, i.e. the
combination of $2^n-2$ objects taken 2 at a time. If A is the smaller of the
two integers and is equal to i, then there are $(2^n-2-i)$ possible choices
for the larger integer B. Since each pair (A,B) is equally likely to be
selected, the probability that A equals the integer i is given by

$\Pr[A=i] = (2^n-2-i) / [(2^n-2)(2^n-3)/2]$


The expected value of A is the weighted average of the possible values
that A can take on. The expected value of A is thus given by

$E[A] = \sum_{i=1}^{2^n-3} i \cdot \Pr[A=i]$

$= \sum_{i=1}^{2^n-3} i \cdot (2^n-2-i)/M$,    where $M = (2^n-2)(2^n-3)/2$

$= (1/M) \, [ (2^n-2)^2 (2^n-3)/2 - (2^n-3)(2^n-2)(2^{n+1}-5)/6 ]$

$= (2^n-2) - (2^{n+1}-5)/3$

$= (2^n-1)/3$.
The probability that B = j is given by

$\Pr[B=j] = (j-1)/M$

The expected value of B is then

$E[B] = \sum_{j=2}^{2^n-2} j \cdot \Pr[B=j]$

$= \sum_{j=2}^{2^n-2} j \cdot (j-1)/M$

$= (1/M) \sum_{j=2}^{2^n-2} j \cdot (j-1)$

$= (1/M) \, [ (2^n-2)(2^n-1)(2^{n+1}-3)/6 - 1 - (2^n-2)(2^n-1)/2 + 1 ]$

$= (2^n-1)(2^{n+1}-3)/[3 (2^n-3)] - (2^n-1)/(2^n-3)$

$= (2^n-1)(2^{n+1}-6)/[3 (2^n-3)]$

$= 2 (2^n-1)/3$.

Thus, $E[B] = 2 E[A] = 2 (2^n-1)/3$ when selecting a pair of
numbers (A,B), where A < B, at random, without replacement, from the set
$T = \{1, 2, 3, \ldots, 2^n-2\}$.
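
The closed forms can be confirmed for small n by enumerating every pair directly. The sketch below is a brute-force check, not part of the original derivation; the function name is illustrative, and exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import combinations


def expected_pair_values(n):
    """Exact E[A] and E[B] for a pair A < B drawn uniformly, without
    replacement, from {1, 2, ..., 2**n - 2}."""
    top = 2 ** n - 2
    pairs = list(combinations(range(1, top + 1), 2))   # every (a, b) with a < b
    count = len(pairs)
    e_a = Fraction(sum(a for a, _ in pairs), count)
    e_b = Fraction(sum(b for _, b in pairs), count)
    return e_a, e_b


results = {n: expected_pair_values(n) for n in range(2, 7)}
```

For n = 3, for instance, the set is {1, ..., 6} and the enumeration gives E[A] = 7/3 and E[B] = 14/3.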

3. THE SHIFT-AND-ADD PROPERTY. Let S be a sequence of period $2^n-1$
generated by a primitive polynomial over GF(2) of degree n. $S^j$ is the
sequence formed by cyclically shifting S by j bits to the left. By the
shift-and-add property of m-sequences the modulo 2 sum of S and $S^j$
equals $S^k$ ($S \oplus S^j = S^k$) where the shift $k < 2^n-1$ is uniquely
determined by the shift j. Let $\{(j_i, k_i)\}$ be the set of all shift-and-add
pairs for a primitive polynomial f(x) of degree n over GF(2) where $j_i < k_i
< 2^n-1$. Let the set $\{j_i\} = J$ and the set $\{k_i\} = K$. Note: it can be shown
that there are $(2^n-2)/2$ distinct shift-and-add pairs for each primitive
polynomial of degree n over GF(2) and J and K partition the set
$\{1, 2, 3, \ldots, 2^n-2\}$.


Since f(x) generates the sequence

$S = S(x) = s_0 + s_1 x + s_2 x^2 + \cdots$

we see that $f(x) \cdot S(x) = 0$. That is, f(x) annihilates S(x). Since
f(x) is primitive, f(x) is the minimal generator of S(x). If $S \oplus S^j \oplus S^k =
S(x) + x^j S(x) + x^k S(x) = 0$, then $(1 + x^j + x^k) \cdot S(x) = 0$ and
$(1 + x^j + x^k)$ also annihilates S(x). Therefore, f(x) divides every
trinomial defined by a shift-and-add pair. Because f(x) is an irreducible
polynomial, $f(x) \nmid x^t$ for any t. Since $f(x) \mid (1 + x^j + x^k)$, then

$f(x) \mid (1 + x^j + x^k) x^{-j} = x^{-j} + 1 + x^{k-j}$ and

$f(x) \mid (1 + x^{k-j} + x^{2^n-1-j})$.

Therefore $(k-j, 2^n-1-j)$ is a shift-and-add
pair where $(k-j) < (2^n-1-j)$. Hence, (k-j) is an element of J and $(2^n-1-j)$
is an element of K. As a result, the sets $\{k_i - j_i\}$ and $\{j_i\}$ are equal and

$\sum_i (k_i - j_i) = \sum_i j_i$

from which it follows that

$\sum_i k_i = 2 \sum_i j_i$.


This is equivalent to the expected statistical result determined above when
pairs of numbers are drawn at random without replacement, from the set
$\{1, 2, 3, \ldots, 2^n-2\}$. Thus, sequences generated by a primitive polynomial have
the randomness property shown by randomly selecting pairs of numbers from a
finite set of size $2^n-2$.
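
The relation between the two sets of shifts can be checked numerically for small degrees. The sketch below generates one period of an m-sequence from a linear recurrence and finds every shift-and-add pair by direct comparison of cyclic shifts. The recurrences used correspond to the primitive polynomials $x^3 + x + 1$, $x^4 + x + 1$ and $x^5 + x^2 + 1$, which are standard examples chosen here for illustration; the function names are hypothetical.

```python
def m_sequence(taps, n):
    """One period of s[t+n] = XOR of s[t+i] for i in taps, started from 1,0,...,0."""
    seq = [1] + [0] * (n - 1)
    while len(seq) < 2 ** n - 1:
        start = len(seq) - n
        seq.append(sum(seq[start + i] for i in taps) % 2)
    return seq


def shift_and_add_pairs(seq):
    """All pairs (j, k), j < k, for which S xor S^j equals S^k (left shifts)."""
    period = len(seq)
    shift_of = {tuple(seq[j:] + seq[:j]): j for j in range(period)}
    pairs = set()
    for j in range(1, period):
        shifted = seq[j:] + seq[:j]
        total = tuple(a ^ b for a, b in zip(seq, shifted))
        k = shift_of[total]              # raises KeyError if seq is not an m-sequence
        pairs.add((min(j, k), max(j, k)))
    return sorted(pairs)


cases = {3: (0, 1), 4: (0, 1), 5: (0, 2)}     # degree -> recurrence taps
pairs_by_degree = {n: shift_and_add_pairs(m_sequence(taps, n))
                   for n, taps in cases.items()}
```

For degree 3 the pairs found are (1,3), (2,6) and (4,5), so the smaller shifts sum to 7 and the larger shifts sum to 14.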


1. Fredricksen, H., "A Survey of Full Length Nonlinear Shift Register Cycle Algorithms", SIAM Review, Vol. 24, #2, April 1982, pp. 195-221.

2. Simon, M. et al., Spread Spectrum Communications, Computer Science Press, 1985.

3. Dixon, R. C., Spread Spectrum Systems, Wiley, 1976.

4. Stiffler, J. J., Theory of Synchronous Communications, Prentice Hall, 1971.

5. Golomb, S. W., Digital Communications with Space Applications, Prentice Hall, 1964.

6. Peterson, W. W. and Weldon, E. J. Jr., Error-Correcting Codes, MIT Press, 1972.

7. Solomon, G. (in Balakrishnan, A.), Communication Theory, McGraw Hill, 1967.

8. Tausworthe, R. C., "Random Numbers Generated by Linear Recurrences Modulo Two", Math. Computation 19, April 1965.

9. Golomb, S. W., Shift Register Sequences, Aegean Park Press, 1982.

10. Selmer, E. S., Linear Recurrence Relations over Finite Fields, Mathematics Department, U. of Bergen, Bergen, Norway, 1964.



THERMODYNAMIC  GAUGE  THEORY  OF  SOLIDS  AND 
QUANTUM  LIQUIDS  WITH  INTERNAL  PHASE 


Richard  A.  Weiss 

U. S. Army Engineer Waterways Experiment Station
Vicksburg,  Mississippi  39180 


ABSTRACT. The local gauge invariance of relativistic thermodynamics
under phase rotations suggests that bulk matter systems have density and
temperature dependent internal phase angles associated with the state
functions. A procedure for determining the internal phase angles associated
with energy, pressure, entropy, thermodynamic potentials, and gauge
parameters of solids and quantum liquids is presented in terms of the
renormalization group equations of relativistic thermodynamics. The calculated
magnitudes of the thermodynamic state functions depend on the values of the
internal phase angles. It is suggested that the external angular momentum of
systems may be coupled to the angular momenta associated with the internal
space of thermodynamic phase angles. Applications to mechanical waves in
matter with internal phase are considered. These effects are expected to
be found in high density and pressure systems such as atomic nuclei, neutron
stars, nuclear explosions, and the interaction of directed energy beams with
matter.


1. INTRODUCTION. The complete understanding of matter and radiation
at high densities requires a locally scale and gauge invariant theory of the
forces and fields that determine the properties of a physical system [1,2]. The
basic forces in a physical system are associated with a local gauge group; for
example, the gauge group of the standard model of the strong and electroweak
interactions is $SU(3)_C \times SU(2)_L \times U(1)_Y$. For simple systems, such as
electromagnetism, the gauge group is U(1), the group of phase rotations [3]. Local
gauge symmetry has unified the interactions of nature, and it is only natural
because of such success to attempt a similar synthesis in other areas of
physics such as thermodynamics and mechanics.

The vacuum state plays an important role in the development of local
gauge theories of the four fundamental interactions. It produces observable
effects in quantum electrodynamic calculations of the fermion self-energy,
vertex modification, and vacuum polarization as manifested in the Lamb shift [4,5].
In quantum flavordynamics the nonzero expectation value of the vacuum Higgs
field produces the spontaneous symmetry breaking that gives rise to the
massive intermediate vector bosons that mediate the weak interactions [1]. In
quantum chromodynamics the vacuum polarization leads to the concepts of a running
coupling constant and asymptotic freedom for the non-Abelian gauge theories [1,3].
The question then arises as to whether vacuum effects appear in systems at the
macroscopic level, and whether a synthesis of thermodynamics and continuum
mechanics can be based on a locally gauge invariant theory that includes the
effects of the vacuum state.
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As  part  of  a  general  program  to  determine  the  state  equation  of  systems 
at  high  densities,  a  local  gauge  theory  of  matter  and  radiation  has  been  de¬ 
veloped  that  is  based  on  a  gauge  and  scale  invariant  relativistic  trace  equa¬ 
tion  written  to  include  vacuum  effects  as  follows6 


3V  — (PV) 
dVv  U 


(i) 


where U = relativistic (renormalized) internal energy, P = relativistic
pressure, T = absolute temperature, V = volume of substance, and $U^a$ and
$P^a$ = corresponding nonrelativistic internal energy and pressure. Throughout
this paper the index "a" will refer to nonrelativistic (unrenormalized)
calculations. The trace equation (1) can be rewritten as7,8
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Equation  (1)  can  also  be  written  as 


E  +  |  Cv  -  3(P  -  K^,)  +  (T  |y  -  P)(3y  -  b)  =  Ea  +  |  -  ba(T  -  Pa)  (3) 

where  E  =  relativistic  energy  density  =  U/V  ,  E  =  nonrelativistic  energy 
density,  and  where7*8 


T(DP/3T)y 
b  "  "(P  -  K^J 

a  T(3Pa/3T)v 
(Pa  -  Kj) 


(4) 


(5) 


(6) 


where $\gamma$ = Grüneisen parameter, $C_V$ = relativistic heat capacity at
constant volume, and $C_V^a$ = nonrelativistic heat capacity at constant
volume, given respectively by
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and  where 


V '  -''(f). 
are the relativistic and nonrelativistic values of the bulk modulus
respectively. The parameters b and $\gamma$ are the gauge parameters of
relativistic thermodynamics. Equation (2) can be decoupled into two independent
equations by noting that E and P are related by the Gibbs-Helmholtz relation as
follows9


II 
With the introduction of a Lagrange undetermined multiplier $\eta$, equation (11)
can be rewritten as


1  +  v  f)E  +  f  - T  f)p  * 0 


Using  equation  (12)  allows  equation  (2)  to  be  decoupled  as  follows 
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M  =  f  +  1 


N  =  h  -  1 


^  =  <T  W-  b3v  W+  1  -  ^ 


Equations (13) and (14) are the ground state renormalization group equations of
relativistic thermodynamics.


For a solid or low temperature quantum system the nonrelativistic, scalar
state equation of the ground state is assumed to have the following form6-8

$$E^a = E_o^a + E_j^a T^j + \cdots \tag{20}$$

$$P^a = P_o^a + P_j^a T^j + \cdots \tag{21}$$

where $E^a$ and $P^a$ = nonrelativistic energy density and pressure
respectively, $E_o^a$ and $P_o^a$ = nonrelativistic zero-temperature values of
the energy density and pressure respectively, $E_j^a$ and $P_j^a$ =
nonrelativistic thermal coefficients for the energy density and pressure
respectively, T = absolute temperature of the system (°K), and j = numerical
index having values characteristic of the type of physical system. Note that
$U^a = V E^a$ and $U_o^a = V E_o^a$ where $U_o^a$ = zero-temperature value of
the unrenormalized internal energy.


A commonly used descriptor of the thermal state equations given by
equations (20) and (21) is the nonrelativistic zero-temperature value of the
Grüneisen parameter that is defined by
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(j-1)  Ea  dVV  V 
except for j = 1. Here $\gamma_o^a$ = nonrelativistic zero-temperature value of
the Grüneisen parameter, and V = volume of the material system. When j = 1,
$\gamma_o^a = 2/3$. The zero temperature value of the nonrelativistic bulk
modulus is given by $K_o^a = n\,dP_o^a/dn$, where n = N/V = number of moles per
unit volume, and N = number of moles of a substance.


The  corresponding  relativistic  scalar  state  equation  will  be  written 
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P  =  P  +  P .  TJ  +  • 
°  J 


_J_  ±  ±(VE  ) 
(3-0  E.  dv(vtj} 


except for j = 1, when $E_j = E_j^a$, where $E_o$ and $P_o$ = relativistic
zero-temperature energy density and pressure respectively, $E_j$ and $P_j$ =
relativistic thermal coefficients for the energy density and pressure
respectively, and $\gamma_o$ = relativistic zero-temperature Grüneisen
parameter. The relativistic value of the zero temperature bulk modulus is given
by $K_o = n\,dP_o/dn$. Note that $U_o = V E_o$ where $U_o$ = zero temperature
value of the renormalized internal energy. Combining equation (2) with the
state equations (20) through (25) yields the following ground state equations

Eo  -  3Gl  +  VPo 
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where  the  internal  energy  coefficients  are  given  by 


Ea  r  V  1 

j  , .  w  [  t  a  .  dV 

E  =  exP  ^  <^0  -  Yq)  — 

j  L 


and where $P_o$, $K_o$ and $\gamma_o$ = zero temperature values of the
relativistic pressure, incompressibility ($-V\,dP_o/dV$) and Grüneisen
parameter respectively, and $P_o^a$, $K_o^a$ and $\gamma_o^a$ are the
corresponding nonrelativistic values of these quantities. Eqs. (26) and (27)
are a set of coupled nonlinear differential equations for $P_o$ and


Equation  (26)  is  equivalent  to 
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The  trace  equation  for  radiation  in  matter  can  be  derived  by  a  perturbation 
technique  applied  to  equation  (2)  with  the  result7’ 9 


-  b  +  T  TF  '  bV  jl)Er  -  3t(T  If  -  P) 


-  3  1  +  Y  +  v 


i  -  'T  Tf  K  -  V(T  If  -  p)]  ‘  Y 


i 


m 


Vi 


$38 


,  t\> 


ft 


■1 


•M 


(T  IT  +  £rV  3V  +  Mt)Er  -  Br<T  3r  -  P)  “  K 


(T  Tt  +  V  J7  +  Nr)Pr 


h  6  (T  -  P) 
r  r  3T  ' 


-  pS 

K& 

< 

(39) 

=  0 

(40) 

HmA' 

moli' 

where 


f  =  n  -  b 
r  r 


\  =  (nr/3  -  y ) 


M  =  f  +  i 
r  r 


h  -  1 
r 


Local gauge and scale invariance has unified continuum mechanics and
thermodynamics.6-8 In particular, it has been shown that local scale invariance for
thermodynamics  requires  the  introduction  of  two  gauge  parameters  b  and  y  which 
must  be  determined  simultaneously  with  the  energy  density.  It  has  been  shown 
that  the  Lie  group  e±4>  is  the  scale  invariance  group  of  relativistic  thermo¬ 
dynamics  that  is  based  on  a  trace  equation. 7  The  group  U ( 1 )  of  phase  rotations 
e— i^  is  the  gauge  invariance  group  of  this  theory.8 


The  invariance  of  the  trace  equation  under  scale  transformations  of  the 


form  P  -*■  P'  =  Pe-T  and  E  ->■  E '  =  Ee  T  ,  and  under  phase  rotations  of  the  form 

P  -*  P '  =  Pe"^  ,  £-*■£'  =  Ee-^  ,  y  y'  =  ye^  ,  and  b  b '  =  be^  leads  to 
the renormalization group equations of relativistic thermodynamics.7,8 For
instance, gauge invariance for phase rotations yields the following
renormalization group equations for the complex gauge parameters
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where  P  and  E  are  the  magnitudes  of  the  pressure  and  energy  density  respec¬ 
tively.  Symmetrization  gives  the  following  results  for  gauge  invariance  under 
phase  rotations8 
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Similar  equations  result  from  the  scale  invariance  condition  which  gives  dy/dP 
and  db/dE  for  changes  in  the  magnitudes  of  the  pressure  and  energy  density.7’ 

These  results  suggest  that  the  pressure,  energy  density  and  the  gauge  pa¬ 
rameters  may  themselves  be  intrinsically  complex  numbers  that  are  associated 
with  internal  phase  angles.  Accordingly  the  relativistic  trace  equation  (1) 
will  be  written  as 
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or  equivalently  as 
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where  U,  E,  P,  y ,  and  b  are  complex  number  representations  of  the  internal 
energy,  energy  density,  pressure,  and  the  gauge  parameters.  The  corresponding 
equation  for  radiation  in  matter  with  internal  phases  is  derived  from  equation 
(50)  to  be 
where $E_r$, $P_r$, $\beta_r$, and $\delta_r$ are the complex number
generalizations of the functions that appear in equations (30) through (36).

This  paper  presents  a  theory  of  the  relativistic  thermodynamics  of  solids 
and  quantum  liquids  with  internal  phase.  The  renormalization  group  equations 
for  systems  with  internal  phase  are  derived,  and  a  procedure  for  solving  the 
complex  number  relativistic  trace  equation  is  presented  that  allows  the  deter¬ 
mination  of  the  internal  phase  angles  associated  with  the  pressure,  energy 
density,  and  gauge  parameters.  The  non-zero  values  of  the  phase  angles  rep¬ 
resent  a  spontaneously  broken  symmetry. 


2.  THERMODYNAMIC  STATE  FUNCTIONS  FOR  SYSTEMS  WITH  INTERNAL  PHASE.  In 
order  to  solve  the  complex  number  trace  equation  (50)  it  is  first  necessary  to 
determine  the  relations  between  the  complex  thermodynamic  state  functions  and 
to  determine  their  connection  to  the  internal  phase  angles.  This  will  be  done 
using  the  first  and  second  laws  of  thermodynamics.  The  complex  number  thermo¬ 
dynamic  state  functions  that  appear  in  equations  (49)  and  (50)  will  be  written 
in  terms  of  their  internal  phase  angles  as  follows 

$$\bar{U} = U e^{i\theta_u} \tag{52}$$

$$\bar{E} = \bar{U}/V = E e^{i\theta_u} \tag{53}$$

$$\bar{P} = P e^{i\theta_p} \tag{54}$$

$$\bar{\gamma} = \gamma e^{i\theta_\gamma} \tag{55}$$

$$\bar{b} = b e^{i\theta_b} \tag{56}$$

where $\theta_u$, $\theta_p$, $\theta_\gamma$, and $\theta_b$ = internal phase
angles of the internal energy, pressure, Grüneisen parameter, and b gauge
parameter respectively. An overbar marks the complex number form of a
quantity. In addition the complex number entropy will be written as

$$\bar{S} = S e^{i\theta_s} \tag{57}$$

where $\theta_s$ = internal phase angle of the entropy. In general all of the
phase angles are functions of V and T. The quantities U, E, P, $\gamma$, b, and
S are the magnitudes of the complex thermodynamic state functions, and are also
functions of V and T.
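
As an illustration of equations (52) through (57), a complex state function can
be handled numerically as a magnitude and phase pair. The sketch below uses
hypothetical numerical values and helper names (`state_function`,
`energy_density`) that do not come from the paper. It only shows that dividing
the complex internal energy by the real volume leaves the internal phase angle
unchanged, as stated in equation (53).

```python
import cmath


def state_function(magnitude, phase):
    """Return the complex number magnitude * exp(i * phase)."""
    return cmath.rect(magnitude, phase)


def energy_density(U_bar, V):
    """Complex energy density U_bar / V for a real volume V, as in eq. (53)."""
    return U_bar / V


# Hypothetical magnitude, phase angle (radians), and volume.
U, theta_u, V = 5.0, 0.3, 2.0
E_bar = energy_density(state_function(U, theta_u), V)
E, theta_E = cmath.polar(E_bar)  # magnitude and phase of the energy density
```

Here `E` is the magnitude U/V and `theta_E` coincides with `theta_u`, because V
is a real positive number.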

The  complex  number  bulk  modulus  is  obtained  from  equation  (54)  as  follows 
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Equation  (52)  immediately  gives  the  complex  number  heat  capacity  as 


$$\bar{C}_V = \left(\frac{\partial \bar{U}}{\partial T}\right)_V = C_V e^{i\theta_{CV}} \tag{62}$$

where 
$$C_V \cos\rho = \partial U/\partial T \tag{63A}$$

$$C_V \sin\rho = U\,\partial\theta_u/\partial T \tag{63B}$$
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30  30 
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(64) 

$$\tan\rho = \frac{U\,\partial\theta_u/\partial T}{\partial U/\partial T} \tag{65}$$

Thus the renormalized values of $C_V$ and $K_T$ include the effects of the
internal phase angles $\theta_u$ and $\theta_p$ respectively.
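
The real and imaginary relations (63A) and (63B) can be checked numerically. The
following is a minimal sketch, assuming hypothetical functions `U_mag(T)` and
`theta_u(T)` at fixed volume and assuming that the heat capacity phase
decomposes as $\theta_{CV} = \theta_u + \rho$ (which is what (63A) and (63B)
imply). The complex derivative is approximated by a central difference.

```python
import cmath
import math


def U_mag(T):
    """Hypothetical internal energy magnitude U(T) at fixed volume."""
    return 1.0 + 0.5 * T**2


def theta_u(T):
    """Hypothetical internal phase angle theta_u(T) at fixed volume."""
    return 0.1 + 0.2 * T


def U_bar(T):
    """Complex internal energy U(T) * exp(i * theta_u(T))."""
    return cmath.rect(U_mag(T), theta_u(T))


def heat_capacity(T, h=1e-6):
    """Return (C_V, rho) from a central difference of the complex energy."""
    dU_dT = (U_bar(T + h) - U_bar(T - h)) / (2.0 * h)
    C_v, theta_cv = cmath.polar(dU_dT)
    rho = theta_cv - theta_u(T)  # assumed: theta_CV = theta_u + rho
    return C_v, rho


T = 1.5
C_v, rho = heat_capacity(T)
real_part = C_v * math.cos(rho)  # should match dU/dT = T for this U(T)
imag_part = C_v * math.sin(rho)  # should match U * dtheta_u/dT = 0.2 * U(T)
```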

The  relationships  between  the  various  state  functions  and  their  internal 
phase  angles  are  determined  from  the  First  and  Second  laws  of  thermodynamics 
which  can  be  written  for  matter  and  radiation  with  internal  phase  angles  as 
follows 


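$$T\,d\bar{S} = T e^{i\theta_s}(dS + iS\,d\theta_s) = d\bar{U} + \bar{P}\,dV \tag{66}$$

or equivalently as

$$T\frac{\partial \bar{S}}{\partial V} = \frac{\partial \bar{U}}{\partial V} + \bar{P} \tag{67}$$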
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Combining  equations  (52),  (53)  and  (57)  with  equations  (67)  and  (68),  and 
separating  into  real  and  imaginary  parts,  yields  the  following  equations 
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Squaring  and  adding  equations  (69)  and  (70)  gives 

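$$T^2\left[\left(\frac{\partial S}{\partial V}\right)^2 + S^2\left(\frac{\partial \theta_s}{\partial V}\right)^2\right] = \left(\frac{\partial U}{\partial V}\right)^2 + U^2\left(\frac{\partial \theta_u}{\partial V}\right)^2 + P^2 + 2P\frac{\partial U}{\partial V}\cos(\theta_p - \theta_u) + 2PU\frac{\partial \theta_u}{\partial V}\sin(\theta_p - \theta_u) \tag{73}$$

Squaring and adding equations (71) and (72) gives

$$T^2\left[\left(\frac{\partial S}{\partial T}\right)^2 + S^2\left(\frac{\partial \theta_s}{\partial T}\right)^2\right] = \left(\frac{\partial U}{\partial T}\right)^2 + U^2\left(\frac{\partial \theta_u}{\partial T}\right)^2 \tag{74}$$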
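
The squared sum over the volume derivatives is the squared modulus of
$\partial\bar{U}/\partial V + \bar{P}$, so it can be verified with complex
arithmetic. The sketch below is an illustration only: the function names and
the numerical inputs are hypothetical, and the expanded form is the one
obtained by expanding that modulus with $\bar{U} = U e^{i\theta_u}$ and
$\bar{P} = P e^{i\theta_p}$.

```python
import cmath
import math


def modulus_squared_direct(dU_dV, U, dth_dV, P, theta_u, theta_p):
    """|d(U e^{i theta_u})/dV + P e^{i theta_p}|^2 by complex arithmetic."""
    dUbar_dV = cmath.rect(1.0, theta_u) * complex(dU_dV, U * dth_dV)
    return abs(dUbar_dV + cmath.rect(P, theta_p)) ** 2


def modulus_squared_expanded(dU_dV, U, dth_dV, P, theta_u, theta_p):
    """Expanded real form of the same quantity in magnitudes and phases."""
    d = theta_p - theta_u
    return (dU_dV**2 + (U * dth_dV) ** 2 + P**2
            + 2.0 * P * dU_dV * math.cos(d)
            + 2.0 * P * U * dth_dV * math.sin(d))


# Hypothetical values: dU/dV, U, dtheta_u/dV, P, theta_u, theta_p.
args = (-0.7, 2.0, 0.15, 1.3, 0.4, 1.1)
direct = modulus_squared_direct(*args)
expanded = modulus_squared_expanded(*args)
```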


The  Gibbs-Helmholtz  equation  for  matter  with  internal  phase  is  written  as 

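$$\frac{\partial \bar{U}}{\partial V} = T\frac{\partial \bar{P}}{\partial T} - \bar{P} \tag{75}$$

Using the Gibbs-Helmholtz equation allows equation (67) to be rewritten as9

$$\frac{\partial \bar{S}}{\partial V} = \frac{\partial \bar{P}}{\partial T} \tag{76}$$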

which  allows  equations  (69)  and  (70)  to  be  rewritten  as 
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Squaring  and  adding  equations  (77)  and  (78)  gives 
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Also,  from  Maxwell's  relationship  it  follows  that" 
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From  which  it  follows  that 
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The  Gibbs-Helmholtz  equation  (75)  can  be  separated  into  real  and  imaginary 
components  as  follows 
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Equations  (84)  and  (85)  can  be  rewritten  in  terms  of  the  energy  density  as 
follows 
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Squaring  and  adding  equations  (84)  and  (85)  gives 
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Combining  equations  (73)  and  (79)  gives 
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Expanding  the  right  hand  side  of  equation  (88)  and  using  equation  (89)  gives 
the  following  simple  result 
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Similarly 
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The  three  basic  thermodynamic  potentials  will  now  be  considered. 


The  enthalpy  of  a  substance  with  internal  phase  is  written  as 


$$\bar{H} = H e^{i\theta_H} = \bar{U} + \bar{P}V \tag{92}$$

where $\bar{H}$ = complex number enthalpy, H = enthalpy magnitude, and
$\theta_H$ = internal phase angle of the enthalpy. Combining equation (92) with
equations (52) and (53) gives
The differential of the vector enthalpy is

$$d\bar{H} = e^{i\theta_H}(dH + iH\,d\theta_H) = T\,d\bar{S} + V\,d\bar{P}$$


which  yields 
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where  S  and  P  are  taken  to  be  the  two  independent  variables.  Combining  equa¬ 
tions  (96)  and  (97)  gives 
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while  combining  equations  (98)  and  (99)  gives 
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The  complex  free  energy  is  written  as 
$$\bar{A} = A e^{i\theta_A} = \bar{U} - T\bar{S} \tag{102}$$

where $\bar{A}$ = complex number free energy, A = magnitude of the free energy,
and $\theta_A$ = internal phase angle of the free energy. Combining equations
(52) and (57) with equation (102) yields

$$A^2 = (U\cos\theta_u - TS\cos\theta_s)^2 + (U\sin\theta_u - TS\sin\theta_s)^2$$

The differential of the vector free energy in equation (102) is

$$d\bar{A} = e^{i\theta_A}(dA + iA\,d\theta_A) = -\bar{P}\,dV - \bar{S}\,dT$$
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from  which  it  follows  that 
The complex number form of the Gibbs-Helmholtz equation for the free energy
is written as9
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which  gives  immediately 
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Combining  equations  (103),  (111)  and  (115)  gives 


$$A\frac{\partial A}{\partial T} = S\left[TS - U\cos(\theta_u - \theta_s)\right] \tag{116}$$


The  complex  number  form  of  the  Gibbs  free  energy  is  given  by 


$$\bar{G} = G e^{i\theta_G} = \bar{U} + \bar{P}V - T\bar{S} = \bar{A} + \bar{P}V \tag{117}$$

where $\bar{G}$ = complex number Gibbs free energy, G = magnitude of the Gibbs
free energy, and $\theta_G$ = internal phase angle of the Gibbs free energy. It
follows immediately from equation (117) that
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which  gives 


$$G^2 = U^2 + P^2V^2 + T^2S^2 + 2UPV\cos(\theta_u - \theta_p) - 2UTS\cos(\theta_u - \theta_s) - 2PVTS\cos(\theta_p - \theta_s) \tag{120}$$
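
The magnitude of the Gibbs function follows from expanding the squared modulus
of the three complex terms in equation (117); the expansion contains one cross
term for each pair of terms. The following sketch compares the expanded
expression with a direct complex evaluation. The function names and the input
values are hypothetical and serve only as a consistency check.

```python
import cmath
import math


def gibbs_complex(U, P, V, T, S, th_u, th_p, th_s):
    """Complex Gibbs function U e^{i th_u} + V P e^{i th_p} - T S e^{i th_s}."""
    return (cmath.rect(U, th_u) + V * cmath.rect(P, th_p)
            - T * cmath.rect(S, th_s))


def gibbs_magnitude_squared(U, P, V, T, S, th_u, th_p, th_s):
    """Expanded squared magnitude with all three pairwise cross terms."""
    return (U**2 + (P * V) ** 2 + (T * S) ** 2
            + 2.0 * U * P * V * math.cos(th_u - th_p)
            - 2.0 * U * T * S * math.cos(th_u - th_s)
            - 2.0 * P * V * T * S * math.cos(th_p - th_s))


# Hypothetical magnitudes and phase angles (radians).
values = (3.0, 1.5, 2.0, 4.0, 0.5, 0.3, 0.9, -0.4)
G_squared_direct = abs(gibbs_complex(*values)) ** 2
G_squared_expanded = gibbs_magnitude_squared(*values)
```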
The differential of the vector Gibbs function in equation (117) is

$$d\bar{G} = e^{i\theta_G}(dG + iG\,d\theta_G) = -\bar{S}\,dT + V\,d\bar{P} \tag{122}$$

from which it follows that

Combining equations (123) and (124) gives

while equations (125) and (126) give
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Further  relationships  can  be  obtained  from  the  vector  form  of  the  Gibbs- 
Helmholtz  equation  for  the  Gibbs  function  which  is  written  as 


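$$\bar{A} = \bar{G} - \bar{P}V = \bar{G} - \bar{P}\left(\frac{\partial \bar{G}}{\partial \bar{P}}\right)_T \tag{129}$$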
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When  the  T  =  0  limit  exists  (as  in  the  case  of  solids  and  quantum  liquids) 
for  matter  with  internal  phase,  the  following  equations  corresponding  to  equa¬ 
tions  (52)  through  (57)  can  be  written 


$$\bar{U}_o = U_o e^{i\theta_u^o} \tag{130}$$

$$\bar{E}_o = \bar{U}_o/V = E_o e^{i\theta_u^o} \tag{131}$$
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So  =  0 
where $\theta_u^o$, $\theta_p^o$, and $\theta_\gamma^o$ are the T = 0 values of
$\theta_u$, $\theta_p$, and $\theta_\gamma$ respectively. Note from the
definition of b in equation (5) it follows that $b_o = 0$ so that
$\bar{b}_o = 0$ also; however, it will be shown later that
$\theta_b^o \neq 0$. From the Third law of thermodynamics it follows that
$S_o = 0$ and $\bar{S}_o = 0$.


For  solids  and  quantum  liquids  one  can  take  the  T  =  0  limit  of  equations 
(84)  and  (85)  and  get 
$$\cos\theta_u^o\,\frac{dU_o}{dV} - U_o\frac{d\theta_u^o}{dV}\sin\theta_u^o = -P_o\cos\theta_p^o \tag{136}$$

$$\sin\theta_u^o\,\frac{dU_o}{dV} + U_o\frac{d\theta_u^o}{dV}\cos\theta_u^o = -P_o\sin\theta_p^o \tag{137}$$


From  equations  (136)  and  (137)  one  gets  immediately 


$$\tan\theta_p^o = \frac{\sin\theta_u^o\,\dfrac{dU_o}{dV} + U_o\dfrac{d\theta_u^o}{dV}\cos\theta_u^o}{\cos\theta_u^o\,\dfrac{dU_o}{dV} - U_o\dfrac{d\theta_u^o}{dV}\sin\theta_u^o} \tag{138}$$
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Using  the  trigonometrical  formula  for  the  tangent  of  the  sum  of  two  angles 
allows  equation  (138)  to  be  rewritten  as 


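$$\theta_p^o = \theta_u^o + \phi_o \tag{140}$$

where

$$\tan\phi_o = \frac{U_o\,d\theta_u^o/dV}{dU_o/dV} = \frac{U_o\,d\theta_u^o/dn}{dU_o/dn} = \frac{E_o\,n\,d\theta_u^o/dn}{n\,dE_o/dn - E_o} \tag{141}$$

$$P_o\cos\phi_o = n\,dE_o/dn - E_o \tag{141A}$$

$$P_o\sin\phi_o = E_o\,n\,d\theta_u^o/dn \tag{141B}$$

From equation (141) it follows that

$$\phi_o = 0 \quad \text{when} \quad \frac{d\theta_u^o}{dn} = 0 \tag{142A}$$

$$\phi_o = \pm\frac{\pi}{2} \quad \text{when} \quad \frac{dU_o}{dn} = 0 \tag{142B}$$

In general for a T = 0 system

(143)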
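
Equations (141A) and (141B) give the zero temperature pressure magnitude and
the phase shift $\phi_o$ directly from the density dependence of $E_o$ and
$\theta_u^o$. A minimal sketch follows; the function name
`zero_temperature_pressure` and the sample numbers are hypothetical, and the
two-argument arctangent is used here (an implementation choice, not taken from
the paper) so that the quadrant of $\phi_o$ is fixed by the signs of both
components.

```python
import math


def zero_temperature_pressure(n, E0, dE0_dn, theta_u0, dtheta_u0_dn):
    """Return (P_o, theta_p_o, phi_o) from eqs. (140), (141A), and (141B)."""
    p_cos = n * dE0_dn - E0          # P_o cos(phi_o), eq. (141A)
    p_sin = E0 * n * dtheta_u0_dn    # P_o sin(phi_o), eq. (141B)
    P0 = math.hypot(p_cos, p_sin)
    phi0 = math.atan2(p_sin, p_cos)
    return P0, theta_u0 + phi0, phi0


# Case 1: density-independent phase angle, so phi_o = 0 as in eq. (142A).
P_a, theta_p_a, phi_a = zero_temperature_pressure(2.0, 3.0, 2.5, 0.4, 0.0)

# Case 2: n dE_o/dn = E_o (dU_o/dn = 0), so phi_o = pi/2 as in eq. (142B).
P_b, theta_p_b, phi_b = zero_temperature_pressure(2.0, 3.0, 1.5, 0.4, 0.1)
```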


Figure 1 shows the density dependence of $\theta_u^o$ and $\theta_p^o$ for an
unbound interacting system such as a neutron gas. The two possible signs that
occur in equation (142B) arise in systems having a saturation density at which
$dU_o/dn = 0$, according to the signs of $dU_o/dn$ and $dP_o/dn$ as
$dU_o/dn \to 0$, as seen in Figure 2. Figure 2 shows the density dependence of
$\theta_u^o$ and $\theta_p^o$ for a system such as N = Z infinite nuclear
matter which is bound at a saturation density. These figures show the effects
of equations (140) through (143). In general $\theta_p^o < \theta_u^o$ in the
high density limit of bound or unbound quantum systems. From equations (58) and
(59) it follows that


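$$\bar{K}_o = n\frac{d\bar{P}_o}{dn} = K_o e^{i(\theta_p^o + \omega_o)} \tag{144}$$

where

$$K_o = \sqrt{\left(n\frac{dP_o}{dn}\right)^2 + P_o^2\left(n\frac{d\theta_p^o}{dn}\right)^2} \tag{145}$$

$$\tan\omega_o = \frac{P_o\,(d\theta_p^o/dn)}{dP_o/dn} \tag{146}$$

$$K_o\cos\omega_o = n\,dP_o/dn \tag{146A}$$

$$K_o\sin\omega_o = P_o\,n\,d\theta_p^o/dn \tag{146B}$$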
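
The magnitude and phase shift of the zero temperature bulk modulus follow from
equations (146A) and (146B) in the same way. The sketch below uses a
hypothetical function name and hypothetical inputs; it simply combines the two
components into $K_o$ and $\omega_o$.

```python
import math


def zero_temperature_bulk_modulus(n, P0, dP0_dn, dtheta_p0_dn):
    """Return (K_o, omega_o) from eqs. (146A) and (146B)."""
    k_cos = n * dP0_dn              # K_o cos(omega_o), eq. (146A)
    k_sin = P0 * n * dtheta_p0_dn   # K_o sin(omega_o), eq. (146B)
    return math.hypot(k_cos, k_sin), math.atan2(k_sin, k_cos)


# Hypothetical inputs chosen so that the components are 3 and 4.
K0, omega0 = zero_temperature_bulk_modulus(1.0, 2.0, 3.0, 2.0)
```

With these inputs the components form a 3-4-5 triangle, so `K0` is 5 and
`omega0` is the angle whose tangent is 4/3, in agreement with equation (146).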


The  complex  number  analogs  of  the  scalar  thermal  state  equations  given 
in  equations  (23)  and  (24)  are 


$$\bar{E} = \bar{E}_o + \bar{E}_j T^j = E_o e^{i\theta_u^o} + E_j e^{i\theta_u^j} T^j = E e^{i\theta_u} \tag{147}$$

$$\bar{P} = \bar{P}_o + \bar{P}_j T^j = P_o e^{i\theta_p^o} + P_j e^{i\theta_p^j} T^j = P e^{i\theta_p} \tag{148}$$

where $E_j$ and $P_j$ = magnitudes of the thermal components of the energy and
pressure respectively, and $\theta_u^j$ and $\theta_p^j$ = phase angles of the
thermal components of the energy density and pressure respectively. From
equations (147) and (148) it follows immediately that


$$E^2 = E_o^2 + 2E_oE_j\cos(\theta_u^o - \theta_u^j)\,T^j + E_j^2T^{2j} \tag{149}$$

$$P^2 = P_o^2 + 2P_oP_j\cos(\theta_p^o - \theta_p^j)\,T^j + P_j^2T^{2j} \tag{150}$$

$$\tan\theta_u = \frac{E_o\sin\theta_u^o + E_j\sin\theta_u^j\,T^j}{E_o\cos\theta_u^o + E_j\cos\theta_u^j\,T^j} \tag{151}$$

$$\tan\theta_p = \frac{P_o\sin\theta_p^o + P_j\sin\theta_p^j\,T^j}{P_o\cos\theta_p^o + P_j\cos\theta_p^j\,T^j} \tag{152}$$
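
The magnitude and phase of the thermal state equation can be evaluated either
by adding the two complex components or from the closed forms for the squared
magnitude and the tangent of the phase. The following sketch compares the two
routes for the energy density; the function names and parameter values are
hypothetical, and the pressure is treated identically.

```python
import cmath
import math


def energy_density_polar(E0, th0, Ej, thj, T, j):
    """Magnitude and phase of E0 e^{i th0} + Ej e^{i thj} T^j, eq. (147)."""
    return cmath.polar(cmath.rect(E0, th0) + cmath.rect(Ej, thj) * T**j)


def magnitude_squared(E0, th0, Ej, thj, T, j):
    """Closed form for E^2, eq. (149)."""
    return (E0**2 + 2.0 * E0 * Ej * math.cos(th0 - thj) * T**j
            + Ej**2 * T ** (2 * j))


def tangent_of_phase(E0, th0, Ej, thj, T, j):
    """Closed form for tan(theta_u), eq. (151)."""
    numerator = E0 * math.sin(th0) + Ej * math.sin(thj) * T**j
    denominator = E0 * math.cos(th0) + Ej * math.cos(thj) * T**j
    return numerator / denominator


# Hypothetical parameters: E_o, theta_u^o, E_j, theta_u^j, T, j.
params = (2.0, 0.2, 0.5, 0.7, 3.0, 2)
E, theta_u = energy_density_polar(*params)
E_squared = magnitude_squared(*params)
tan_theta_u = tangent_of_phase(*params)
```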


Note that $U_o = VE_o$ and $U_j = VE_j$.


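3. GAUGE PARAMETERS FOR SYSTEMS WITH INTERNAL PHASE. The two gauge
parameters that appear in the basic trace equation (50) are $\bar{\gamma}$ and
$\bar{b}$. The complex number Grüneisen parameter is defined as

$$\bar{\gamma} = \frac{V}{\bar{C}_V}\frac{\partial \bar{P}}{\partial T} = \frac{\partial \bar{P}/\partial T}{\partial \bar{E}/\partial T} = \gamma e^{i\theta_\gamma} \tag{153}$$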


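where $\gamma$ and $\theta_\gamma$ = magnitude and phase of the Grüneisen
parameter respectively. Combining equations (53) and (54) with equation (153)
gives

$$\gamma = \frac{\partial P/\partial T}{\partial E/\partial T}\,\frac{\cos\rho}{\cos\mu} = \sqrt{\frac{(\partial P/\partial T)^2 + P^2(\partial \theta_p/\partial T)^2}{(\partial E/\partial T)^2 + E^2(\partial \theta_u/\partial T)^2}} \tag{159}$$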


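The T = 0 limit of equations (156) and (157) can be obtained by noting
that from equations (149) and (150) it follows that

$$\left(\frac{\partial E}{\partial T}\right)_{T \to 0} = jE_j\cos(\theta_u^j - \theta_u^o)\,T^{j-1} \tag{160}$$

$$\left(\frac{\partial P}{\partial T}\right)_{T \to 0} = jP_j\cos(\theta_p^j - \theta_p^o)\,T^{j-1} \tag{161}$$

and from equations (151) and (152) it follows that

$$\left(E\frac{\partial \theta_u}{\partial T}\right)_{T \to 0} = jE_j\sin(\theta_u^j - \theta_u^o)\,T^{j-1} \tag{162}$$

$$\left(P\frac{\partial \theta_p}{\partial T}\right)_{T \to 0} = jP_j\sin(\theta_p^j - \theta_p^o)\,T^{j-1} \tag{163}$$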


It then follows from equations (156) and (157) and (160) through (163) that

$$\rho_o = \theta_u^j - \theta_u^o \tag{164}$$

$$\mu_o = \theta_p^j - \theta_p^o \tag{165}$$

and therefore from equation (158) it follows that

$$\theta_\gamma^o = \theta_p^j - \theta_u^j \tag{166}$$

Finally combining equation (159) with (160) through (163) gives

$$\gamma_o = \frac{P_j}{E_j} \tag{167}$$


which is the same form as in equation (25) for the scalar thermal state
equation. The results in equations (166) and (167) can be obtained directly from
the T = 0 limit of equation (153) by using the complex number thermal state
equation (147) to get

$$\bar{\gamma}_o = \frac{\bar{P}_j}{\bar{E}_j} = \frac{P_j}{E_j}\,e^{i(\theta_p^j - \theta_u^j)} \tag{168}$$
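
The zero temperature limit of the complex Grüneisen parameter can be checked
numerically from its definition as the ratio of the temperature derivatives of
the complex pressure and energy density. The sketch below assumes
temperature-independent coefficients in the thermal state equations (147) and
(148), uses hypothetical parameter values, and approximates the derivatives by
central differences at a small temperature.

```python
import cmath


def thermal_state(X0, th0, Xj, thj, T, j):
    """Complex state function X0 e^{i th0} + Xj e^{i thj} T^j."""
    return cmath.rect(X0, th0) + cmath.rect(Xj, thj) * T**j


def gruneisen_complex(E_params, P_params, T, j, h=1e-4):
    """(dP/dT) / (dE/dT) for the complex P and E, by central differences."""
    dP = (thermal_state(*P_params, T + h, j)
          - thermal_state(*P_params, T - h, j)) / (2.0 * h)
    dE = (thermal_state(*E_params, T + h, j)
          - thermal_state(*E_params, T - h, j)) / (2.0 * h)
    return dP / dE


# Hypothetical (magnitude, phase) pairs for the T = 0 and thermal components.
E_params = (2.0, 0.2, 0.5, 0.7)   # E_o, theta_u^o, E_j, theta_u^j
P_params = (1.0, 0.1, 0.3, 1.2)   # P_o, theta_p^o, P_j, theta_p^j
gamma_o = gruneisen_complex(E_params, P_params, T=0.01, j=2)
expected = cmath.rect(0.3 / 0.5, 1.2 - 0.7)  # (P_j / E_j) e^{i(th_p^j - th_u^j)}
```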

The gauge parameter $\bar{b}$ is defined as follows

$$\bar{b} = \frac{T\,\partial \bar{P}/\partial T}{\bar{P} - \bar{K}_T} = b e^{i\theta_b} \tag{169}$$

where $\bar{K}_T$ is defined in equation (58). Equation (169) can be rewritten as

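$$\bar{b} = \frac{T\dfrac{\partial P}{\partial T} + iPT\dfrac{\partial \theta_p}{\partial T}}{P + V\dfrac{\partial P}{\partial V} + iPV\dfrac{\partial \theta_p}{\partial V}} = \frac{T\dfrac{\partial P}{\partial T} + iPT\dfrac{\partial \theta_p}{\partial T}}{P - n\dfrac{\partial P}{\partial n} - iPn\dfrac{\partial \theta_p}{\partial n}} \tag{170}$$

$$\bar{b} = b e^{i(\mu + \chi)} \tag{171}$$

where $\mu$ is given by equation (157) and $\chi$ is given by

$$\tan\chi = \frac{Pn\,\partial \theta_p/\partial n}{P - n\,\partial P/\partial n} \tag{172}$$

Comparing equations (169) and (171) gives

$$\theta_b = \mu + \chi \tag{173}$$

$$b = \sqrt{\frac{(T\,\partial P/\partial T)^2 + P^2(T\,\partial \theta_p/\partial T)^2}{(P - n\,\partial P/\partial n)^2 + P^2(n\,\partial \theta_p/\partial n)^2}} = \frac{T\,\partial P/\partial T}{P - n\,\partial P/\partial n}\,\frac{\sec\mu}{\sec\chi} \tag{174}$$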
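
The phase and magnitude of the b gauge parameter can be illustrated by
evaluating the complex ratio directly and comparing it with the sum of the two
angles. In the sketch below the function name and the inputs are hypothetical,
and $\mu$ is assumed to satisfy
$\tan\mu = P(\partial\theta_p/\partial T)/(\partial P/\partial T)$, which is how
it enters the numerator of the ratio.

```python
import cmath
import math


def gauge_parameter_b(T, P, dP_dT, dth_dT, n, dP_dn, dth_dn):
    """Return (b_bar, mu, chi) for the ratio form of the b gauge parameter."""
    numerator = complex(T * dP_dT, P * T * dth_dT)
    denominator = complex(P - n * dP_dn, -P * n * dth_dn)
    mu = math.atan2(P * dth_dT, dP_dT)               # assumed definition of mu
    chi = math.atan2(P * n * dth_dn, P - n * dP_dn)  # tan(chi) as in eq. (172)
    return numerator / denominator, mu, chi


# Hypothetical values for T, P, and the temperature and density derivatives.
b_bar, mu, chi = gauge_parameter_b(2.0, 3.0, 1.0, 0.1, 1.0, 1.0, 0.2)
b, theta_b = cmath.polar(b_bar)
# Magnitude from the secant form of eq. (174).
b_secant_form = (2.0 * 1.0) / (3.0 - 1.0 * 1.0) * math.cos(chi) / math.cos(mu)
```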


The T = 0 limit of equation (173) is

$$\theta_b^o = \mu_o + \chi_o = \theta_p^j - \theta_p^o + \chi_o \tag{175}$$

where

$$\tan\chi_o = \frac{P_o\,n\,d\theta_p^o/dn}{P_o - n\,dP_o/dn} \tag{176}$$

while the T = 0 limit of b is obtained from equation (174) to be $b_o = 0$.


In the past two sections the relationships between the various phase
angles and amplitudes of the thermodynamic state functions have been presented.
In the next section a method of calculating the phase angles and amplitudes
will be presented.


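4. RENORMALIZATION GROUP EQUATIONS FOR THE GROUND STATE OF PHASE MATTER.
The phase angles and magnitudes of the complex number thermodynamic state
functions are calculated from the solution of the vector renormalization group
equation (50). Combining equation (50) with equations (52) through (56) gives

$$\left(1 - b e^{i\theta_b} + T\frac{\partial}{\partial T} - b e^{i\theta_b}V\frac{\partial}{\partial V}\right)E e^{i\theta_u} - 3\left(1 + \gamma e^{i\theta_\gamma} + V\frac{\partial}{\partial V} - \gamma e^{i\theta_\gamma}T\frac{\partial}{\partial T}\right)P e^{i\theta_p} = \psi^a \tag{177}$$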


where 


(178) 


Equation  (177)  can  be  separated  into  real  and  imaginary  parts.  The  real  part 
is  given  by 


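$$\cos\theta_u\left(1 - b\cos\theta_b + T\frac{\partial}{\partial T} - b\cos\theta_b\,V\frac{\partial}{\partial V} + b\sin\theta_b\,V\frac{\partial \theta_u}{\partial V}\right)E$$
$$-\ \sin\theta_u\left(-b\sin\theta_b + T\frac{\partial \theta_u}{\partial T} - b\sin\theta_b\,V\frac{\partial}{\partial V} - b\cos\theta_b\,V\frac{\partial \theta_u}{\partial V}\right)E$$
$$-\ 3\cos\theta_p\left(1 + \gamma\cos\theta_\gamma + V\frac{\partial}{\partial V} - \gamma\cos\theta_\gamma\,T\frac{\partial}{\partial T} + \gamma\sin\theta_\gamma\,T\frac{\partial \theta_p}{\partial T}\right)P$$
$$+\ 3\sin\theta_p\left(\gamma\sin\theta_\gamma + V\frac{\partial \theta_p}{\partial V} - \gamma\sin\theta_\gamma\,T\frac{\partial}{\partial T} - \gamma\cos\theta_\gamma\,T\frac{\partial \theta_p}{\partial T}\right)P = \psi^a \tag{179}$$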


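The imaginary part of equation (177) is written as

$$\sin\theta_u\left(1 - b\cos\theta_b + T\frac{\partial}{\partial T} - b\cos\theta_b\,V\frac{\partial}{\partial V} + b\sin\theta_b\,V\frac{\partial \theta_u}{\partial V}\right)E$$
$$+\ \cos\theta_u\left(-b\sin\theta_b + T\frac{\partial \theta_u}{\partial T} - b\sin\theta_b\,V\frac{\partial}{\partial V} - b\cos\theta_b\,V\frac{\partial \theta_u}{\partial V}\right)E$$
$$-\ 3\sin\theta_p\left(1 + \gamma\cos\theta_\gamma + V\frac{\partial}{\partial V} - \gamma\cos\theta_\gamma\,T\frac{\partial}{\partial T} + \gamma\sin\theta_\gamma\,T\frac{\partial \theta_p}{\partial T}\right)P$$
$$-\ 3\cos\theta_p\left(\gamma\sin\theta_\gamma + V\frac{\partial \theta_p}{\partial V} - \gamma\sin\theta_\gamma\,T\frac{\partial}{\partial T} - \gamma\cos\theta_\gamma\,T\frac{\partial \theta_p}{\partial T}\right)P = 0 \tag{180}$$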
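
The separation of the complex trace equation into real and imaginary parts can
be checked numerically. The sketch below is an illustration under stated
assumptions: the dictionary keys and the numerical values are hypothetical, the
partial derivatives of the magnitudes and phase angles are supplied as plain
numbers, and the complex left side is built from
$\partial\bar{E}/\partial T = e^{i\theta_u}(\partial E/\partial T + iE\,\partial\theta_u/\partial T)$
and its analogs.

```python
import cmath
import math


def complex_left_side(s):
    """Left side of the complex trace equation from complex arithmetic."""
    eu, ep = cmath.rect(1.0, s["th_u"]), cmath.rect(1.0, s["th_p"])
    b_bar = cmath.rect(s["b"], s["th_b"])
    g_bar = cmath.rect(s["g"], s["th_g"])
    T, V, E, P = s["T"], s["V"], s["E"], s["P"]
    E_T = eu * complex(s["E_T"], E * s["thu_T"])
    E_V = eu * complex(s["E_V"], E * s["thu_V"])
    P_T = ep * complex(s["P_T"], P * s["thp_T"])
    P_V = ep * complex(s["P_V"], P * s["thp_V"])
    return ((1 - b_bar) * E * eu + T * E_T - b_bar * V * E_V
            - 3 * ((1 + g_bar) * P * ep + V * P_V - g_bar * T * P_T))


def separated_parts(s):
    """Real and imaginary parts written out in magnitudes and phase angles."""
    T, V, E, P = s["T"], s["V"], s["E"], s["P"]
    cb, sb = s["b"] * math.cos(s["th_b"]), s["b"] * math.sin(s["th_b"])
    cg, sg = s["g"] * math.cos(s["th_g"]), s["g"] * math.sin(s["th_g"])
    re_E = E - cb * E + T * s["E_T"] - cb * V * s["E_V"] + sb * V * E * s["thu_V"]
    im_E = -sb * E + T * E * s["thu_T"] - sb * V * s["E_V"] - cb * V * E * s["thu_V"]
    re_P = P + cg * P + V * s["P_V"] - cg * T * s["P_T"] + sg * T * P * s["thp_T"]
    im_P = sg * P + V * P * s["thp_V"] - sg * T * s["P_T"] - cg * T * P * s["thp_T"]
    cu, su = math.cos(s["th_u"]), math.sin(s["th_u"])
    cp, sp = math.cos(s["th_p"]), math.sin(s["th_p"])
    real = cu * re_E - su * im_E - 3 * cp * re_P + 3 * sp * im_P
    imag = su * re_E + cu * im_E - 3 * sp * re_P - 3 * cp * im_P
    return real, imag


# Hypothetical state: magnitudes, phases, and their T and V derivatives.
state = {"T": 1.2, "V": 0.8, "E": 2.0, "P": 1.5, "b": 0.4, "g": 1.7,
         "th_u": 0.3, "th_p": 0.5, "th_b": 0.2, "th_g": -0.1,
         "E_T": 0.6, "E_V": -0.9, "thu_T": 0.05, "thu_V": 0.11,
         "P_T": 0.7, "P_V": -1.3, "thp_T": 0.02, "thp_V": -0.04}
lhs = complex_left_side(state)
real_part, imag_part = separated_parts(state)
```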

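Equations (179) and (180) can be further simplified by introducing equations
(86) and (87) and their two corresponding Lagrange indeterminate multipliers
$\eta$ and $\tau$ to get

$$\eta\left(\cos\theta_u - \sin\theta_u\,V\frac{\partial \theta_u}{\partial V} + \cos\theta_u\,V\frac{\partial}{\partial V}\right)E + \eta\left(\cos\theta_p + \sin\theta_p\,T\frac{\partial \theta_p}{\partial T} - \cos\theta_p\,T\frac{\partial}{\partial T}\right)P = 0 \tag{181}$$

$$\tau\left(\sin\theta_u + \cos\theta_u\,V\frac{\partial \theta_u}{\partial V} + \sin\theta_u\,V\frac{\partial}{\partial V}\right)E + \tau\left(\sin\theta_p - \cos\theta_p\,T\frac{\partial \theta_p}{\partial T} - \sin\theta_p\,T\frac{\partial}{\partial T}\right)P = 0 \tag{182}$$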

Combining  equation  (179)  and  (180)  with  the  constraints  in  equations  (181)  and 
(182)  gives  the  following  four  independent  partial  differential  equations 


W  +  £v  ^7  +  M)E  ' 


(183) 


<"T  "v  Jv  +  R)E  '  0 


(wT  -k  +  xV  Tv  +  Y)p  ‘  0 


<sTff  +  zV^J  +  1>p  '  0 


(184) 


(185) 


(186) 
In the limit of $\theta_i \to 0$ one has

$$t = 1, \quad m = 0, \quad w = \eta/3 - \gamma, \quad s = 0$$

$$f = \eta - b, \quad q = 0, \quad x = 1, \quad z = 0$$

$$M = 1 + \eta - b, \quad R = 0, \quad Y = 1 - \eta/3 + \gamma, \quad I = 0 \tag{199}$$


which  agree  with  equations  (13)  through  (18). 


Equations (183) through (186) are the renormalization group equations that
describe relativistic thermodynamic systems having internal phase angles. There
are ten unknown quantities in equations (183) through (186): $E$, $\theta_u$,
$P$, $\theta_p$, $\gamma$, $\theta_\gamma$, $b$, $\theta_b$, $\eta$, $\tau$.
The ten equations required to determine these quantities are: the four
renormalization group equations (183) through (186), the two equations (158)
and (159) that define $\bar{\gamma}$, the two equations (173) and (174) that
define $\bar{b}$, and the two constraint equations (181) and (182). Equations
(183) through (186) can be derived from a Lagrangian formalism in a manner
similar to that in the accompanying paper.


5.  RENORMALIZATION  GROUP  EQUATIONS  FOR  RADIATION  IN  PHASE  MATTER.  In  a 
manner  similar  to  equations  (52)  through  (56)  the  state  functions  and  gauge 
parameters  for  radiation  that  appear  in  equation  (51)  are  written  as 
$$\bar{\delta}_r = \delta_r e^{i\theta_{\delta r}} \tag{206}$$

where $\theta_{ur}$ = internal phase angle of the radiation internal energy

$\theta_{pr}$ = internal phase angle of the radiation pressure

$\theta_{\gamma r}$ = internal phase angle of the radiation Grüneisen gauge parameter

$\theta_{br}$ = internal phase angle of the radiation gauge parameter

$\theta_{\delta r}$ = internal phase angle of $\delta_r$ gauge function

$\theta_{\beta r}$ = internal phase angle of $\beta_r$ gauge function


In general all of the phase angles and magnitudes that appear in equations
(200) through (206) are functions of V and T. Also all of the equations that
appear in Sections 2 and 3 are also valid for radiation and can be carried
over into the present calculation by adding a subscript "r".


The  complex  number  form  of  the  functions  that  appear  in  equation  (51)  can 
be  written  in  a  form  analogous  to  that  in  equations  (31)  through  (35)  as  follows 
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Combining  equations  (210)  and  (201)  through  (203)  gives 
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(212) 


(213) 
It remains to show how the radiation equation (51) can be decomposed into
four radiation equations which, combined with the defining relations in
equations (200) through (228), can be used to calculate the eight quantities
$E_r$, $\theta_{ur}$, $P_r$, $\theta_{pr}$, $\gamma_r$, $\theta_{\gamma r}$,
$b_r$, and $\theta_{br}$. First note that equation (51) can be rewritten as


(l  -  bei9b  +  T  -  bei0b  v^)Ere10ur  -  qrei9^T  ^(Pei0P)  -  PeiSp‘ 


iSnl  (231) 


-  3  |  |l  +  yei0Y  +  V  ~  -  Yel0YT^jprel0pr  -  6rel0br  T  ^(pel9p)  -  Pe10P  J  =  v® 

The  simplification  of  equation  (231)  can  be  realized  by  noting  that  the  complex 
Gibbs-Helmholtz  equation  for  radiation 


$$\frac{\partial \bar{U}_r}{\partial V} = T\frac{\partial \bar{P}_r}{\partial T} - \bar{P}_r \tag{232}$$


yields  the  following  two  constraint  equations  similar  to  equations  (181)  and 
(182) 
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+  x  (sin  0  -  cos  0  T  ~  sin  _  T  — ) P  -  0 
where two radiation Lagrange indeterminate multipliers, $\eta_r$ and $\tau_r$,
are introduced. Separating equation (231) into real and imaginary parts and
using the constraint equations (233) and (234) yields the following four
independent partial differential equations
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where  fj  '  ,  0gr  ,  <5r  ,  and  0gr  are  given  by  equations  (223)  , 
(220)  ,  and  where 
Equations (235) through (238) are the renormalization group equations for
radiation with internal phase angles. Setting $\theta_i = 0$ in equations (240)
through (251) reduces equations (235) through (251) to equations (39) through
(44).

The  radiation  renormalization  group  equations  (235)  through  (238)  can  easily 
be  derived  from  a  Lagrangian  formalism  in  a  manner  similar  to  that  given  in  the 
accompanying  paper. 

6. GROUND STATE OF SOLIDS AND LOW TEMPERATURE QUANTUM LIQUIDS WITH INTERNAL
PHASE. This section considers the calculation of the energy density, pressure,
and internal phase angles associated with the relativistic state equation of
the form given in equations (147) and (148). The complex number analogs of
equations (26) through (28) are written as


E  -  3 [ ( 1  +  y  )P  -  K  ]  =  ta 
o  o  o  o  o 


(252) 
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where  PQ  ,  KQ  ,  and  w0  are  given  by  equations  (139),  (145),  and  (146)  respec¬ 
tively.  Equation  (252)  can  also  be  written  as 


d2E 
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(256)  . 


Equations (252) and (253) and the constraint equations (136) and (137)
must be solved for $E_0$, $P_0$, $\gamma_0$, and the phase angle $\theta^0$
associated with each of them. Combining equation (256) with equations (131) and
(132) and taking the real and imaginary parts yields the following two equations
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The  angle  0^  enters  the  calculations  through  the  relation  (254) 


U.  U.  ..j 
_1  =  _1  e10- 

ua  ua 

J  3 


U  =  exp^(j  -  1 )  /  (y^  -  Yoe10Y)  ^ 


from which equations (268) and (269) follow immediately. Equivalently one can
use


(271) 


i  m  du-  de 

— i-j— - 1  +  iv  — — 

j-l\U  dV  1V  dV 


to  obtain 


which  immediately  give  equations  (268)  and  (269)  respectively.  From  equation 
(271)  it  also  follows  immediately  that 


(276) 


(277) 


Note that when j = 1: $U_j = \tfrac{3}{2}NR$ and $P_j = \gamma_0\,\tfrac{3}{2}nR$.
The simultaneous solution of equations (257), (258), (261), and (262) along with
the constraint equations (136) and (137) gives $E_0$, $P_0$, $\gamma_0$, and their
associated phase angles. Then equations (272) and (273) give $U_j$ and its phase
angle; $P_j$ is obtained from equation (277), and finally the corresponding phase
angle is obtained from equation (276). In this way all of the elements of the
renormalized state equations (147) and (148) can be determined.


7. EXCITED STATES OF SOLIDS AND LOW TEMPERATURE QUANTUM LIQUIDS WITH INTERNAL
PHASE. The complex number state equations for the excited states of solids
and low temperature quantum liquids are written in analogy to the ground
state equations (147) and (148) as follows
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where  the  T  =  0  equivalents  of  the  quantities  in  equations  (200)  through  (206) 
are 
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where  9°r  ,  0pr  ,  0°r  ,  and  9°r  are  the  T  =  0  values  of  9ur  ,  0pr  ,  dyT  ,  and 

0^r  respectively; -and  where  0^r  and  0^r  are  the  phase  angles  associated  with 
the  thermal  components  of  the  radiation  energy  and  pressure  respectively. 

The T = 0 and thermal components of the complex number equation (51) are
respectively⁸
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The  T  =  0  radiation  bulk  modulus  is  given  by 
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The  functions  and  T^  that  appear  in  equation  (288)  are  given  by 
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In  order  to  decouple  the  complex  number  equation  (288)  into  two  real 
equations  one  must  first  rewrite  the  expressions  for  a  ,  8  ,  and  1>jr  .  From 
equation  (298)  it  follows  that 
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where  yQ  ,  PQ  ,  and  x0  are  given  by  equations  (272),  (139),  and  (267)  respec¬ 
tively,  and  Tq  is  given  by 
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From  equation  (299)  it  follows  that 
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where  K0  and  w0  are  given  by  equations  (145)  and  (146)  respectively.  The  ex 
pression  1>jr  is  obtained  from  equations  (296),  (132),  (133),  (144),  and  (283) 
to  be 
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Using  equations  (289)  through  (321)  allows  the  second  radiation  equation  (288) 
to  be  separated  into  real  and  imaginary  parts  as  follows 
Equations (309), (310), (322), and (323) are the four renormalization group
equations needed to solve for the four unknown radiation functions $E_{or}$,
$\theta^0_{ur}$, $\gamma_{or}$, and $\theta^0_{\gamma r}$. The radiation analogs of
equations (136) and (137) relate $\theta_{ur}$, $\theta_{pr}$, $E_{or}$, and
$P_{or}$. Equations (307) and (308) relate $E_{jr}$ to $\gamma_{or}$.
These eight equations can be used to solve for the eight quantities $E_{or}$,
8. PROCESSES IN PHASE MATTER AND RADIATION. The internal phase angles of
matter and radiation allow an extended interpretation of the types of processes
that can occur in these systems. Consider for example the change in complex
entropy given by equation (57) as
From equation (327) it is clear that setting the complex entropy differential
equal to zero cannot represent a physical process, since setting the real and
imaginary parts both equal to zero would determine two V = V(T) curves, and
these conditions would hold jointly, if at all, only at an intersection point.
In fact two distinct processes can be obtained from equation (327)

$$dS = 0 \quad \text{adiabatic process} \qquad (328)$$

$$d\theta_s = 0 \quad \text{entropy isophase process} \qquad (329)$$


Thus an adiabatic process corresponds to a rotation of the entropy vector S
in internal phase angle space. Processes may occur in nature such that changes
in volume and temperature cause rotation of the entropy and internal energy
vectors. For the case of constant entropy magnitude (dS = 0) the heat
increment is obtained from equation (327) to be

$$dQ = iTS\,d\theta_s \qquad (330)$$

For this adiabatic process the conservation of energy is written as

$$iTS\,d\theta_s = dU + P\,dV \qquad (331)$$

This results in equations (69) through (91) with $\partial S/\partial V = 0$ and
$\partial S/\partial T = 0$.


For the case of constant magnitude of the internal energy, dU = 0 for the
magnitude, and the rotation of the complex internal energy vector is given by

$$dU = iU\,d\theta_u \qquad (332)$$

and equation (66) gives

$$T\,dS = iU\,d\theta_u + P\,dV \qquad (333)$$

This results in equations (69) through (91) with $\partial U/\partial V = 0$ and
$\partial U/\partial T = 0$.

For the case when both dS = 0 and dU = 0, corresponding to rotations of both
the entropy and internal energy vectors, one has

$$iTS\,d\theta_s = iU\,d\theta_u + P\,dV \qquad (334)$$

This results in equations (69) through (91) with $\partial U/\partial V = 0$,
$\partial U/\partial T = 0$, $\partial S/\partial V = 0$, and
$\partial S/\partial T = 0$. Similar results apply for the thermodynamic
potentials H, A, and G.


In general a process will result in a combined stretch and rotation of the
thermodynamic functions U, P, S, H, A, and G, and the time rate of change
of a thermodynamic quantity will include an angular velocity of the internal
phase angles. For example, writing the complex pressure as
$\bar{P} = P e^{i\theta_p}$, the rate of pressure change is given by

$$\frac{d\bar{P}}{dt} = e^{i\theta_p}\left(\frac{dP}{dt} + iP\,\frac{d\theta_p}{dt}\right) \qquad (335)$$

Thus, even if the magnitude of the pressure were held fixed, the pressure vector
can rotate internally.
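
The combined stretch and rotation of a complex thermodynamic quantity can be
checked numerically. The sketch below assumes, purely for illustration, a
complex pressure of the form magnitude times exp(i * phase); the time profiles
chosen for the magnitude and the internal phase angle, and all function names,
are hypothetical and are not taken from the paper.

```python
import cmath

def magnitude(t):
    # Hypothetical pressure magnitude P(t); chosen only for illustration.
    return 2.0 + 0.5 * t

def phase(t):
    # Hypothetical internal phase angle theta_p(t), in radians.
    return 0.3 * t * t

def complex_pressure(t):
    # Complex pressure assumed to be magnitude * exp(i * phase).
    return magnitude(t) * cmath.exp(1j * phase(t))

def rate_numeric(t, step=1e-6):
    # Central-difference time derivative of the complex pressure.
    return (complex_pressure(t + step) - complex_pressure(t - step)) / (2.0 * step)

def rate_stretch_rotation(t):
    # Stretch term dP/dt plus rotation term i * P * dtheta/dt,
    # both multiplied by the common phase factor exp(i * theta).
    dP_dt = 0.5            # derivative of magnitude(t)
    dtheta_dt = 0.6 * t    # derivative of phase(t)
    return cmath.exp(1j * phase(t)) * (dP_dt + 1j * magnitude(t) * dtheta_dt)

if __name__ == "__main__":
    t = 1.0
    print(rate_numeric(t))
    print(rate_stretch_rotation(t))
```

With a constant magnitude the stretch term vanishes and only the rotation term
remains, which is the situation described in the preceding sentence.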


Another possibility is the transfer of external angular momentum to internal
angular momentum and vice versa. Thus external rotation may in some cases
be coupled to the rotation of the internal phase angles. In fact, the
Lagrangian of a rotating body may be of the form


L  -  i  I  I  U2  +  i  I  I.OK+)  a  .ojtu.  -  v(e  ,6.) 
where $I_e$ and $\omega_e$ = external moment of inertia and angular velocity
respectively, $I_i$ and $\omega_i$ = internal moments of inertia and angular
velocities respectively, and where $\theta_e$ and $\theta_i$ = external and
internal angles respectively. The $\omega_i$ consists of $\omega_p$, $\omega_u$,
$\omega_s$, etc., and $\theta_i$ includes $\theta_p$, $\theta_u$, $\theta_s$, and
so on. The internal moments of inertia $I_p$, $I_u$, $I_s$, etc. are associated
with the internal angle coordinates of pressure, internal energy, entropy, etc.
It is expected that for such a system macroscopic energy transfers would occur
between the internal and external dynamical systems. Such transfers may account
for the glitches that appear in the spin-down of pulsars. Similar processes have
been suggested to occur at the level of fundamental particles.⁵,¹⁰⁻¹³


9. CONCLUSION. The local gauge invariance of relativistic thermodynamics
suggests the possibility that the thermodynamic state functions can be
represented as complex numbers whose imaginary parts are related to phase angles
in an internal space associated with all interacting systems of matter and
radiation. Due to vacuum interactions, bulk matter solids and quantum liquids
are coherent in internal space. The phase angles and magnitudes of the
thermodynamic state functions are calculated from a solution of the
renormalization group equations which represent the mathematical description of
the interaction of matter and radiation in matter with the vacuum state. The
internal phase angles are expected to manifest themselves in the state equations
of matter and radiation in matter. In some cases a transfer of energy may occur
between external rotations and the rotations of the internal phase angles. The
internal phase angles are expected to affect the equations of motion of
classical and quantum systems, and should affect the equilibrium configurations
of atomic nuclei and the stars.


The renormalized ground state of a relativistic thermodynamic solid or
quantum liquid is associated with a broken symmetry manifested by the nonzero
values of the internal phases $\theta_p$, $\theta_u$, $\theta_s$, etc. that are
obtained as solutions to the relativistic trace equation. A symmetrical ground
state would have $\theta_p = 0$, $\theta_u = 0$, $\theta_s = 0$, etc. This broken
symmetry should be associated with massive gauge bosons that are connected with
the excited states of the internal phases of bulk matter, i.e., internal spin
waves of the pressure and entropy. Similar broken symmetries are common in
atomic and nuclear systems.¹⁴ As a practical application of these ideas one can
conceive of a bulk matter vacuum-induced broken symmetry thermodynamic engine.
Such an internal phase engine would utilize the broken symmetry nature of the
ground state of bulk matter in a manner analogous to the broken symmetry
ferromagnetic state of an iron armature in an electric motor.
ABSTRACT. Matter and radiation Lagrangians are developed from which the
renormalization group equations of locally gauge invariant relativistic
thermodynamics can be obtained by the Euler-Lagrange equations. The Noether
current tensor and conservation equations of relativistic thermodynamics can be
derived from these Lagrangians. These Lagrangians exhibit a minimum value
when expressed in terms of the fractal dimension of a physical system, and
are locally symmetric about this minimum value. This suggests that all matter
and radiation in matter is fractal in nature. Gases, liquids, solids, quantum
liquids, and the mechanical waves that propagate in these systems have fractal
properties. Equations for calculating the fractal dimensions of matter and
radiation in matter are presented, and general expressions for the void ratio of
gases, condensed matter, and radiation are derived. These results will have
applications to matter and radiation at high densities such as occur in neutron
stars, nuclear explosions, and the interaction of directed energy beams with
matter.


1. INTRODUCTION. Lagrangian formulations of the theory of continuous
systems are common in the classical mechanics of particles and fields.¹,² But
it is in the quantum theory of fields that the Lagrangian formalisms have
exhibited their unique power to describe new physical effects in addition to
yielding the dynamical equations of motion.³⁻⁵ For instance, the properties
of a chiral Lagrangian yield the left-right asymmetries of the electroweak
force.⁶,⁷ The spontaneously broken symmetry of a Lagrangian gives rise to such
diverse phenomena as mass generation of gauge bosons, the existence of Goldstone
bosons, the ferromagnetic ground state, the Meissner effect for superconductors,
and many other subtle effects.⁶,⁷ These results suggest that other locally
gauge invariant systems, such as relativistic thermodynamics, may have a simple
Lagrangian description.

A set of relativistic thermodynamic renormalization group equations for
the ground and excited states of matter and radiation has been derived using
the local scale (gauge) invariance of relativistic thermodynamics.⁸,⁹ These
renormalization group equations are partial differential equations for the
energy and gauge parameters, and are similar in form to the Callan-Symanzik
equations of relativistic quantum field theory.⁷ These equations are derived from
a relativistic trace equation that accounts for the vacuum interactions of
matter and radiation in a four-dimensional formalism.¹⁰ The trace equation is
locally gauge invariant under the U(1) group in the sense that the values of the
gauge transformation functions depend on the local density and temperature of
a system.⁹
This paper develops a Lagrangian formulation of relativistic thermodynamics
that is shown to be equivalent to the renormalization group equations.
The Lagrangian density can be used to determine the fractal dimension
(Hausdorff number) of bulk matter and radiation in bulk matter. Fractal matter
systems are discussed extensively in the literature.¹¹⁻¹⁷ For relativistic
thermodynamics, the fractal dimension is related to the state equation of a
system, and its deviation from the homogeneous case is due to the vacuum
interactions of matter and radiation. In this paper the Gibbs-Helmholtz equation
is used to estimate the void ratios of fractal matter and radiation.

2. RENORMALIZATION GROUP EQUATIONS FOR A FRACTAL GROUND STATE. The locally
gauge invariant interaction of the vacuum state with uniform bulk matter
and radiation is described by a relativistic trace equation.¹⁰ The question
arises, however, whether the vacuum interaction will produce a uniform system
of matter and radiation or whether it will result in a fractal state. This
question can be answered by developing the renormalization group equations for
the fractal states of matter and radiation. The fractal analog of the
relativistic trace equation is written as¹⁰


“ +  T(S)PV  - DV  a?' <pv>u 


U3  +  Ti 


I  dU_ 

\dT 


PaV 
where D = fractal dimension = Hausdorff number.¹¹⁻¹⁷ The vacuum state
(space-time) has D = 3 to a very high degree of accuracy.¹⁸ On the other hand,
matter and radiation in matter need not have D = 3, and in general the fractal
dimension of a system will depend on volume and temperature, D = D(V,T). In
equation (1), U = relativistic internal energy, P = relativistic pressure,
T = absolute temperature, V = volume of substance, and $U^a$ and $P^a$ =
corresponding nonrelativistic internal energy and pressure. Throughout this
paper the index a refers to nonrelativistic calculations.


For a fractal system with Hausdorff number D, equation (1) becomes
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and  where  n  =  Lagrange  multiplier  given  by 


V  —  -  yT  —  +  (y  +  1)P 
n  3V  T  3T  ^  ‘ 


P  -  T  — 
3T 


Equations (10) through (17) reduce to equations (25) through (32) of Reference 9
for the case D = 3. Thus f, h, M, and N for the fractal ground state are
now explicit functions of the fractal dimension D, and therefore E, P, and
$\gamma$ are also explicit functions of D. Finally, it will be assumed that
$f \neq 0$ and $h \neq 0$ so that equations (10) and (11) can be rewritten as
For a solid or low temperature quantum system the nonrelativistic state
equation of the ground state is assumed to have the following form
where $E^a$ and $P^a$ = nonrelativistic energy density and pressure respectively,
$E_0^a$ and $P_0^a$ = nonrelativistic zero-temperature values of the energy
density and pressure respectively, $E_j^a$ and $P_j^a$ = nonrelativistic thermal
coefficients for the energy density and pressure respectively, T = absolute
temperature of the system (K), and j = numerical index having values
characteristic of the type of physical system. A commonly used descriptor of the
thermal state equations given by equations (20) and (21) is the nonrelativistic
zero-temperature value of the Grüneisen parameter that is defined by
Note that in the derivation of equation (27) it is assumed that any
temperature dependence of D in the form $D = D_0 + D_j T^j + \cdots$ can be
neglected and that essentially $D_j = 0$. If this is not assumed, an additional
term $-D_j\left[(1 + \gamma_0)P_0 - K_0\right]$ has to be inserted into the left
hand side of equation (27).
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Equivalent  forms  of  equation  (29)  are 
Thus in general $E_0 = E_0(E_0^a, \gamma_0^a, D_0, V)$ and
$\gamma_0 = \gamma_0(E_0^a, \gamma_0^a, D_0, V)$. It is possible that originally
the unrenormalized state is fractal in nature so that $E_0^a = E_0^a(D_0, V)$ and
$\gamma_0^a = \gamma_0^a(D_0, V)$. If in equation (29) one takes
$E_0 \sim n^{a_0}$ and $E_0^a \sim n^{a_0^a}$, where $a_0$ = adiabatic index, and
$\gamma_0$ = constant, one gets
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It  then  follows  that 
A general expression for the void ratio in a fractal ground state can be
obtained from the Gibbs-Helmholtz equation¹⁰
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Equation  (36)  will  be  written  in  finite  difference  form  corresponding  to  a 
change  of  fractal  dimension  from  the  uniform  D  =  3  case  to  the  general  case  of 
arbitrary  fractal  dimension  as  follows 
where the notation, E(D) = energy density associated with a fractal dimension
D, and E(3) and P(3) = energy density and pressure respectively for the
homogeneous case of D = 3, is introduced for calculating void ratios. The T = 0
limit of equation (37) is given by
Note that the energy density for a fractal ground state is greater than that
of the homogeneous state.¹⁹


3. RENORMALIZATION GROUP EQUATIONS FOR FRACTAL RADIATION IN FRACTAL
MATTER. The renormalization group equation for fractal radiation in fractal
matter can be written as a simple extension of the corresponding equation for
homogeneous matter, using the same notation as in equation (70) of Reference 9,
as follows
where the fractal dimension of the radiation in matter is $D_r = D - d_r$, and
where $d_r > 0$ is the incremental change in the fractal dimension due to the
presence of radiation in the system, and where
The functions $\gamma_r$ and $b_r$ are the radiation gauge parameters, and
$K_r$ = radiation bulk modulus. Note that $\beta_r$, $\delta_r$, and $d_r$ are
generally small quantities, while $\gamma_r$, $b_r$, and $D_r$ refer to the
radiation itself and are not small quantities.

For D = 3 and $d_r = 0$, equation (39) reduces to equation (70) of Reference 9.

Equation (39) can be decoupled into two independent radiation renormalization
group equations as follows
where $n_r$ = Lagrange multiplier. For the case D = 3 and $d_r = 0$, equations
(45) through (51) reduce to equations (74) through (80) of Reference 9. It
will be assumed that $f_r \neq 0$ and $h_r \neq 0$ so that equations (45) and (46)
can be written as
The energy density and pressure for radiation in solids and quantum liquids
are written as⁹
$E_{jr}$ and $P_{jr}$ = relativistic thermal coefficients for the radiation
energy density and pressure respectively


The zero temperature value of the radiation Grüneisen parameter is obtained
from equations (44) and (54) through (57) to be
The zero temperature values of the nonrelativistic and relativistic radiation
bulk moduli are written as $K^a = n\,dP^a/dn$ and $K = n\,dP/dn$ respectively.
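
As a minimal sketch of the bulk modulus definition K = n dP/dn, the following
code evaluates it by finite differences for a hypothetical power-law pressure
P = A n^a. The power-law form, the coefficient, and the index are assumptions
made only for illustration; for such a law the definition gives K = a P.

```python
def pressure(density, coefficient=1.5, index=5.0 / 3.0):
    # Hypothetical power-law pressure P = A * n**a (illustrative values only).
    return coefficient * density ** index

def bulk_modulus(density, step=1e-6):
    # K = n * dP/dn, with dP/dn estimated by a central difference.
    dP_dn = (pressure(density + step) - pressure(density - step)) / (2.0 * step)
    return density * dP_dn

if __name__ == "__main__":
    n = 2.0
    print(bulk_modulus(n))              # numerical K
    print((5.0 / 3.0) * pressure(n))    # analytic a * P for the power law
```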
When radiation is present in a fractal solid or low temperature quantum
liquid, the T = 0 fractal dimension of the radiation will be written as
$D_{or} = D_0 - d_{or}$ where $d_{or} > 0$ is the small change in fractal
dimension associated with the addition of radiation to the material system. The
excitation equations for such a system are obtained from equations (45) and (46)
and are an extension of equations (104) and (105) of Reference 9. They are
written as
In the derivation of equation (60) it is assumed that one can neglect a
temperature dependence of $D_r$ of the form $D_r = D_{or} + D_{jr}T^j + \cdots$
(or equivalently,
$d_r = D - D_r = D_0 - D_{or} + (D_j - D_{jr})T^j + \cdots = d_{or} + d_{jr}T^j + \cdots$)
and that $D_j = 0$, $D_{jr} = 0$, and $d_{jr} = 0$. If this is not the case
and $D_j \neq 0$, then an additional term
$+d_{jr}\left[(1 + \gamma_0)P_0 - K_0\right]$ has to be inserted into the left
hand side of equation (60). For this case both $d_{or}$ and $d_{jr}$ must be
determined. Equation (59) can be rewritten as

(65) 


d2E  dE 

Don  - 11  "  °o(1  +  >o)n  “d! f  +  [V  +  Yo}  +  l]Ec 

an 


+  D  P  (y 

o  c .  o  o 


Y  J  +  d  [(1  +  y  ) P  -  K  ]  =  E 
or  or  o  o  o  or 


Equations (59) through (65) reduce to equations (104) through (113) of Reference
9 for the case $D_0 = 3$ and $d_{or} = 0$. The values of $d_{or}$ and $d_{jr}$
can be obtained from an appropriate Lagrangian formalism.

From  the  Gibbs-Helmholtz  equation  for  a  radiation  system 
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one  obtains  the  following  estimate  for  the  void  ratio  of  a  fractal  mechanical 
radiation  system  in  matter 
where the notation $E_r(D,d_r)$ = fractal radiation energy density, and
$E_r(3,0)$ and $P_r(3,0)$ = homogeneous radiation energy density and pressure
respectively, is introduced for calculating the void ratios of the fractal
radiation field. The T = 0 limit of equation (67) is given by
Specific examples of the use of these equations for radiation in gases and
condensed matter are given in Sections 7 and 8.

4. GROUND STATE LAGRANGIAN. Lagrangian formulations of nonlocal gauge
field theories have been used to describe the four basic interactions that occur
in nature.⁶,⁷ One is tempted to write a similar Lagrangian formulation of the
effects of the vacuum state on bulk matter. This section develops a Lagrangian
description of a nonlocal gauge theory of relativistic thermodynamics. Let the
Lagrangian function of a relativistic thermodynamic system be written as

t  _  Oz-li  ii  a  ..  (b cn 


=  JC(£  ,  f  .♦.v.tX 


and  the  thermodynamic  action  I  be  written  as 
where $\mathcal{L}$ = Lagrangian density, $\phi = \phi(v,t)$ is an appropriately
selected field, and where

$$v = \ln V \qquad (70)$$

$$t = \ln T \qquad (71)$$

The  introduction  of  the  variables  in  equations  (70)  and  (71)  is  made  because 
it  simplifies  the  ground  state  renormalization  group  equations  (18)  and  (19) 
which  now  become 


/  3C  \  d  /  3C  \  =  3 C 

: \3<t»  rJ  dv \ 3 J  34> 
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where  the  following  notation  was  introduced 
The Lagrangian density $\mathcal{L}(\phi_{,t}, \phi_{,v}, \phi, v, t)$ and the
field $\phi(v,t)$ are selected in an appropriate way for relativistic
thermodynamics so that the Euler-Lagrange equations (74) will reproduce the
ground state renormalization group equations (72) and (73). In order to
reproduce equation (72) one takes $\phi = \xi$ where

$$\xi = \int E\,dv = \int E\,\frac{dV}{V} = \xi(V,T,D)$$

The corresponding Lagrangian density is
The Euler-Lagrange field equations derived from $\delta I = 0$ are¹⁻³
where $\xi_{,vv} = E_{,v}$ is treated as a parameter dependent on v and t but
independent of $\xi$, $\xi_{,v}$ or $\xi_{,t}$. In order to see that equation (72)
can be derived from $\mathcal{L}_1$ and equation (74) it is noted that
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Placing  these  quantities  in  equation  (74)  yields  equation  (72) . 
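
The role of the logarithmic variable v = ln V in this section can be
illustrated numerically. The sketch below assumes a hypothetical power-law
energy density (not taken from the paper) and checks two facts used above:
that V dE/dV equals dE/dv, and that a field defined as the integral of E over v
has E as its derivative with respect to v. All names and the exponent are
illustrative assumptions.

```python
import math

EXPONENT = 4.0 / 3.0  # hypothetical power-law exponent, for illustration only

def energy_density(volume):
    # Assumed form E(V) = V**(-EXPONENT).
    return volume ** (-EXPONENT)

def central_difference(func, x, step=1e-6):
    return (func(x + step) - func(x - step)) / (2.0 * step)

def volume_times_dE_dV(volume):
    # V * dE/dV evaluated in the original variable V.
    return volume * central_difference(energy_density, volume)

def dE_dv(volume):
    # dE/dv evaluated in the logarithmic variable v = ln V.
    log_volume = math.log(volume)
    return central_difference(lambda v: energy_density(math.exp(v)), log_volume)

def xi(volume):
    # Integral of E dv for the assumed power law (integration constant set to zero).
    return -energy_density(volume) / EXPONENT

def dxi_dv(volume):
    log_volume = math.log(volume)
    return central_difference(lambda v: xi(math.exp(v)), log_volume)

if __name__ == "__main__":
    V = 2.0
    print(volume_times_dE_dV(V), dE_dv(V))
    print(dxi_dv(V), energy_density(V))
```

Working in v and t turns the operators V d/dV and T d/dT into plain
derivatives, which is why the renormalization group equations take a simpler
form in these variables.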


In a similar fashion, in order to reproduce equation (73) one takes
$\phi = \zeta$ where

$$\zeta = \int P\,dv = \int P\,\frac{dV}{V} = \zeta(V,T,D)$$

The corresponding Lagrangian density is
where $\zeta_{,vv} = P_{,v}$ is treated as a parameter dependent on v and t but
independent of $\zeta$, $\zeta_{,v}$ or $\zeta_{,t}$. The following relationships
hold
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=  G  +  Ht 


Placing equations (95) through (99) into equation (74) shows that
$\mathcal{L}_2$ is a proper Lagrangian density for the pressure renormalization
group equation (73).

It should be pointed out that the Lagrangian densities $\mathcal{L}_1$ and
$\mathcal{L}_2$ have a proper T = 0 limit and are given by
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where 
where $f_0$, $h_0$, $M_0$ and $N_0$ are defined in Reference 9. Equations (100)
and (101) yield the following T = 0 ground state equations
The two potential functions associated with $\mathcal{L}_1$ and $\mathcal{L}_2$
are obtained from equations (77) and (89) to be

$$V_2 = G\zeta + \tfrac{1}{2}H\zeta^2 \qquad (114)$$


If B < 0, G < 0 and C > 0, H > 0 the potentials have a minimum at certain
values of $\xi$ and $\zeta$ which are determined from


(115) 


where $\xi$ and $\zeta$ are varied by changing the fractal dimension D for fixed
V and T. The conditions in equation (115) are equivalent to
where $\xi_M(V,T)$ and $\zeta_M(V,T)$ = values of $\xi$ and $\zeta$ for which
$V_1$ and $V_2$ are respectively minimum. Taking the derivatives of equations
(118) and (119) with respect to v yields respectively
Either equation (120) or (121) can be solved for the value $D = D_M(V,T)$ that
makes the potentials $V_1$ and $V_2$ have minimum values. About the minimum, the
potentials have the form
where $\xi = \xi_M + \Delta\xi$ and $\zeta = \zeta_M + \Delta\zeta$. Thus the
potentials are locally symmetric about the minimum points. The value of D at the
minimum points will be designated as $D_M(V,T)$, and in general $D_M < 3$ so that
matter will have voids and is fractal in nature for specified values of V and T.
The fractal ground state of matter occurs only for limited regions of V and T
corresponding to the conditions B < 0, G < 0 and C > 0, H > 0.
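
The sign conditions on the coefficients can be illustrated with a short sketch.
It assumes that each potential has the quadratic form V(x) = B x + (1/2) C x^2,
as suggested by equation (114); the factor 1/2 and the numerical coefficients
are assumptions made for illustration. For B < 0 and C > 0 the minimum lies at
the positive value x_M = -B/C, and the potential is symmetric about it.

```python
def potential(x, linear, quadratic):
    # Assumed quadratic form V(x) = linear * x + 0.5 * quadratic * x**2.
    return linear * x + 0.5 * quadratic * x * x

def minimum(linear, quadratic):
    # A minimum exists only when the quadratic coefficient is positive.
    if quadratic <= 0:
        raise ValueError("quadratic coefficient must be positive for a minimum")
    x_min = -linear / quadratic
    return x_min, potential(x_min, linear, quadratic)

if __name__ == "__main__":
    B, C = -2.0, 4.0                 # illustrative values with B < 0 and C > 0
    xi_M, V_min = minimum(B, C)
    print(xi_M, V_min)
    # Symmetry check: equal displacements on either side give equal potentials.
    print(potential(xi_M + 0.3, B, C), potential(xi_M - 0.3, B, C))
```

Expanding about the minimum in this assumed form leaves only the quadratic
term, V = V_min + (1/2) C (x - x_M)^2, which is the local symmetry referred to
in the text.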


It should be pointed out that the Lagrangians $\mathcal{L}_1$ and
$\mathcal{L}_2$ are not unique in the sense that the following two Lagrangians
also yield the desired renormalization group equations
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C2  =  7  +  ,  +  4(4  +  X4  ) 
These Lagrangians are linear in $\xi$ and $\zeta$ respectively and do not contain
quadratic potential terms as in the cases of equations (77) and (89). The
quadratic Lagrangians are chosen instead of the linear potential Lagrangians
because they are symmetrical about their minimum values.


5. EXCITED STATE LAGRANGIAN. The radiation renormalization group equations
(45) and (46) can also be obtained from a Lagrangian formulation. The Lagrangian
density that gives equation (45) is
where $\xi_{r,vv} = E_{r,v}$ is treated as a parameter which is dependent on v and
t but is independent of $\xi_r$, $\xi_{r,v}$ or $\xi_{r,t}$. To show that the
Euler-Lagrange equation (74) yields equation (45) when $\phi = \xi_r$ it is noted
that
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dtK,t/  =  f r  3c 
Combining equations (133) through (136) with equation (74) yields equation (45).

The Lagrangian density that yields equation (46) is found by choosing
$\phi = \zeta_r$ in equation (74) with $\mathcal{L} = \mathcal{L}_{2r}$ where
If $\zeta_{r,vv}$ is taken to be a parameter only dependent on v and t, one has
which combine with equation (74) to give equation (46). It should be noted
that the Lagrangians $\mathcal{L}_{1r}$ and $\mathcal{L}_{2r}$ have natural
extensions to the case T = 0.


In  a  form  similar  to  that  in  equations  (113)  and  (114),  the  potentials 
associated  with  radiation  in  matter  are  given  by 
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H| 

(149) 
If $B_r < 0$, $G_r < 0$, and $C_r > 0$, $H_r > 0$, these potentials will have minimum
values at specific values of $\xi_r$ and $\zeta_r$ given by
where $\xi_{rM}(V,T,D)$ and $\zeta_{rM}(V,T,D)$ = values of $\xi_r$ and $\zeta_r$ for which $V_{1r}$ and $V_{2r}$
are respectively minimum, and where the variation of $\xi_r$ and $\zeta_r$ in equations
(150) and (151) corresponds to a change in $d_r$ for fixed values of V, T, and D.
The value of the fractal dimension of radiation in fractal matter is $D_r = D - d_r$
generally, where $d_r > 0$, so that in general there will be voids in a mechanical
radiation field. If the value $d_r = d_{rM}(V,T,D)$ minimizes the potentials $V_{1r}$ and
$V_{2r}$, then the fractal dimension of mechanical radiation in fractal matter is
$D_{rM} = D_M - d_{rM}$, and the fractal dimension of radiation in homogeneous matter is
$D_{rM} = 3 - d_{rM}$. Mechanical radiation in matter is fractal in nature, and this
includes waves in gases, liquids, and solids for limited regions of temperature
and density where $B_r < 0$, $G_r < 0$ and $C_r > 0$, $H_r > 0$. Note that in general
$D_M < 3$ so that $D_{rM} < 3$. Finally, when $V_{1r}$ and $V_{2r}$ have minimum values they can
be expanded about these minimum values by writing $\xi_r = \xi_{rM} + \Delta\xi_r$ and
$\zeta_r = \zeta_{rM} + \Delta\zeta_r$ as follows
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Electromagnetic  radiation  in  matter  will  be  treated  in  another  paper. 


6.  THERMODYNAMIC  NOETHER  CURRENT  TENSOR.  Because  the  renormalization 
group  equations  can  be  derived  from  a  variational  principle,  there  exists  a 
formal  procedure  for  determining  the  conservation  laws  as  a  result  of  the  form 
invariance  of  the  Lagrangian  density  under  continuous  transformations.  This 
procedure  is  given  by  Noether's  theorem.''0  If  the  coordinates  of  a  field 
$\phi(x_\mu)$ undergo a continuous translation of the form

$x_\mu \to x'_\mu = x_\mu + \Delta x_\mu$ (154)


then  the  Noether  tensor 
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satisfies  the  conservation  law 
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In  this  paper  it  will  be  assumed  that  the  thermodynamic  Lagrangian  densities 
are form invariant under continuous changes of volume and temperature of the
form

$t \to t' = t + \delta t$ (157)

$v \to v' = v + \delta v$ (158)

This is true for the Lagrangian densities of relativistic thermodynamics because
they are ultimately expressed in terms of the energy density and pressure, and
these quantities are form invariant under continuous changes of volume and
temperature.

The  Noether  current  tensor  for  the  energy  density  of  the  ground  state  of 
a  thermodynamic  system  is  given  by 
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which  satisfy  the  following  conservation  equations 
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The  components  of  the  Noether  tensor  for  the  ground  state  energy  density  are 
obtained from equations (159) through (162) by using the expression for $\mathcal{L}_1$
given in equation (77) as follows


^  =4-  A42  -  1C 

vv  2  ,  v  ,  t 


UB  -  *  )  -  j  CL* 


(165) 


♦ft  '  '  7  «fv  -  Bt  -  1  c s2 


(166) 


<t  =  IC 

vt  ,  v 


(167) 


\v  =  ^t(A^,v  +  f;Z> 


(168) 


The  Noether  tensor  for  the  ground  state  pressure  of  a  thermodynamic  system 
is  written  as 
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which  satisfy  the  following  conservation  equations 
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The  solution  of  the  conservation  equations  (163),  (164),  (173),  and  (174)  yields 
the conserved quantities of the ground state of renormalized relativistic
thermodynamics.


The Noether tensor for the radiation energy density in a thermodynamic
system is given by
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The  Noether  tensor  for  the  radiation  pressure  in  a  thermodynamic  system 
is  given  by 
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while  the  conservation  equations  are 
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The  simultaneous  solution  of  equations  (183),  (184),  (193),  and  (194)  yield  the 
conserved  quantities  for  radiation  in  matter. 


7.  VOID  RATIOS  FOR  THE  REAL  GASES.  In  order  to  use  the  expression  for  the 
ground  state  void  ratio  given  in  equation  (37)  it  is  necessary  to  calculate  the 
energy  density  and  pressure  of  real  gases  for  both  the  fractal  and  homogeneous 
cases. For the homogeneous case with D = 3, the renormalized pressure and
energy density are given by$^{10,21}$

$P(3) = nRT\,[1 + nB^a + n^2 C(3) + \cdots]$ (199)

$E(3) = nRT\,[\tfrac{3}{2} - nT\,\partial B^a/\partial T - \tfrac{1}{2} n^2 T\,\partial C(3)/\partial T - \cdots]$ (200)


where

$C(3) = C^a - 3(B^a)^2 \ln\psi^a$ (201)

where $B^a$ and $C^a$ = unrenormalized second and third virial coefficients
respectively, and C(3) = renormalized third virial coefficient for the uniform D = 3
real gas, and where $\psi^a$ [not to be confused with equation (16)] is a function of
the second virial coefficient given by$^{10,21}$

$\psi^a = \frac{T}{T_R}\left[\frac{B^a(T)}{B^a(T_R)}\right]^{2/3}$ (202)

where $T_R$ = species dependent relativity temperature of real gases. The
corresponding state equations for a fractal real gas are written as


$P(D) = nRT\,[1 + nB^a + n^2 C(D) + \cdots]$ (203)

$E(D) = nRT\,[\tfrac{3}{2} - nT\,\partial B^a/\partial T - \tfrac{1}{2} n^2 T\,\partial C(D)/\partial T - \cdots]$ (204)

where

$C(D) = C^a - D(B^a)^2 \ln\psi^a$ (205)

In order to obtain equation (205) it is necessary to assume that D is
independent of T and n.

Combining  equations  (199),  (200),  (203),  and  (204)  with  equation  (37)  gives 


$\frac{\Delta V}{V} = \frac{1}{3} n^2 T \frac{\partial}{\partial T}[C(3) - C(D)]$ (206)


where  the  following  approximation  was  used  in  equation  (37) 


$E(3) + P(3) - T\,\partial P(3)/\partial T \approx \tfrac{3}{2} nRT$ (207)


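Combining equations (201), (205), and (206) gives

$\frac{\Delta V}{V} = -\frac{1}{3} n^2 (3 - D)\,T \frac{\partial}{\partial T}[(B^a)^2 \ln\psi^a] + \cdots$ (208)

$\quad = \frac{1}{9} n^2 (3 - D)\,T \frac{\partial}{\partial T}[C(3) - C^a] + \cdots$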

Note  that  in  general 


$C(3) - C(D) = \tfrac{1}{3}(3 - D)[C(3) - C^a]$ (209)
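
As a quick numerical sanity check of equation (209), the following sketch evaluates both sides from the definitions in equations (201) and (205). The function name and the sample values of $B^a$, $C^a$, $\psi^a$ and D are illustrative assumptions, not values taken from the paper.

```python
import math


def c_renormalized(D, B_a, C_a, psi_a):
    """C(D) = C^a - D (B^a)^2 ln(psi^a), as in equations (201) and (205)."""
    return C_a - D * B_a**2 * math.log(psi_a)


# Hypothetical sample inputs, chosen only to exercise the identity.
D, B_a, C_a, psi_a = 2.6, -1.5e-4, 2.0e-8, 1.7

# Left side of (209): C(3) - C(D).
lhs = c_renormalized(3, B_a, C_a, psi_a) - c_renormalized(D, B_a, C_a, psi_a)
# Right side of (209): (1/3)(3 - D)[C(3) - C^a].
rhs = (3 - D) / 3 * (c_renormalized(3, B_a, C_a, psi_a) - C_a)

print(math.isclose(lhs, rhs, rel_tol=1e-9))
```

The identity holds for any D because both differences are proportional to $(B^a)^2 \ln\psi^a$.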


Thus  voids  will  exist  in  the  ground  state  of  real  gases  only  in  the  temperature 
intervals  for  which  AV/V  >  0  in  equation  (208).  This  condition  gives 


$\frac{\partial}{\partial T}[(B^a)^2 \ln\psi^a] < 0$ (210)

or equivalently

$\frac{\partial}{\partial T}[C(3) - C^a] > 0$ (211)


as shown in Figure 1. From Figure 1 it is clear that the largest voids
will  occur  at  low  temperatures.  There  is  also  a  narrow  fractal  region  just 
above  the  Boyle  temperature,  and  a  broad  fractal  region  at  high  temperatures. 
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Ultimately,  the  voids  that  occur  in  real  gases  are  due  to  the  interaction  of 
these  gases  with  the  vacuum  state,  which  manifests  itself  in  the  third  and 
higher  virial  coefficients.  The  ideal  gas  is  not  fractal. 


Consider now mechanical radiation in a real gas. For the homogeneous
case with D = 3 and $d_r = 0$ the renormalized radiation pressure and energy
density are$^{21}$

$P_r(3,0) = nRT\,[\tfrac{1}{12} k_o^2 A_o^2 + nB_r^a + n^2 C_r(3,0) + \cdots]$ (212)

$E_r(3,0) = nRT\,[\tfrac{1}{4} k_o^2 A_o^2 - nT\,\partial B_r^a/\partial T - \tfrac{1}{2} n^2 T\,\partial C_r(3,0)/\partial T - \cdots]$ (213)


where $k_o$ and $A_o$ = wave number and amplitude of waves in an ideal gas, and where
$B_r^a$ = nonrelativistic (unrenormalized) second radiation virial coefficient, and
$C_r(3,0)$ = relativistic third virial coefficient for a homogeneous (D = 3 and
$d_r = 0$) mechanical radiation field given by


C  (3,0)  =  Ca  -  3[2BaB3  +  (B3)2]  In  f3  -  3(B3  +  B3)2  ln{ 1  +  — )  (214) 


where $\psi^a$ is given by equation (202) and $\psi_r^a$ [not to be confused with equation
(51)] is given by$^{21}$


B3(T)  +  B3(T) 


*a  +  *3=f  — - - 

r  tr  B3(Tr)  +  b3(tr) 


(215) 


The procedure for calculating $B_r^a$ and $C_r^a$ is given in Reference 21. The
corresponding state equations for fractal radiation in a fractal real gas are given
by


P  (D.d  )  =  nRT[  A-k2A2  +  nB3  +  n2C  (D,d  )  +  •••] 
rr  12  oo  r  rr 


(216) 


.  0  iC  (D.d  ) 

Er<D’V  ■  ">ThV.  -  nT  TF  -  7  "'T  1T  C 


(217) 


where $C_r(D,d_r)$ = relativistic third virial coefficient for a fractal mechanical
radiation field in a fractal real gas and is given by
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2 

C  (D ,  d  )  =  C3  -  D[2BaBa  +  (B3)  ]  In  ya  -  D(Ba  +  B3)2  tnll  +  —  )  (218) 

r  r  r  r  r  r  \  ,  a  / 


+  dr(Bd)  In  ■■/ 


where  the  fractal  dimension  of  the  mechanical  radiation  in  a  real  gas  is  now 
Dr  =  D  -  dr  which  is  lower  than  the  fractal  dimension  D  of  the  ground  state  of 
a  real  gas.  Note  that  in  order  to  obtain  equation  (218)  it  is  assumed  that  D 
and  dr  are  independent  of  T  and  V. 

The calculation of the void ratio for radiation in a real gas then proceeds
from equation (67). Note first the following approximation


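$E_r(3,0) + P_r(3,0) - T\,\partial P_r(3,0)/\partial T \approx \tfrac{1}{4} nRT k_o^2 A_o^2$ (219)

Combining equations (213) and (217) gives

$E_r(D,d_r) - E_r(3,0) = \tfrac{1}{2} n^3 R T^2 \frac{\partial}{\partial T}[C_r(3,0) - C_r(D,d_r)]$ (220)

Placing equations (219) and (220) into equation (67) yields

$\left(\frac{\Delta V}{V}\right)_r = \frac{2 n^2 T}{k_o^2 A_o^2} \frac{\partial}{\partial T}[C_r(3,0) - C_r(D,d_r)] + \cdots$ (221)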


where

$C_r(3,0) - C_r(D,d_r) = -(3 - D)\{[2B^aB_r^a + (B_r^a)^2] \ln\psi^a + (B^a + B_r^a)^2 \ln(1 + \psi_r^a/\psi^a)\} - d_r (B^a)^2 \ln\psi^a$ (222)

Placing equations (201) and (214) into equation (222) yields

$C_r(3,0) - C_r(D,d_r) = \tfrac{1}{3}(3 - D)[C_r(3,0) - C_r^a] + \tfrac{d_r}{3}[C(3) - C^a]$ (223)

Combining  equations  (221)  and  (223)  gives  the  following  expression  for  void 
ratio  of  mechanical  radiation 
It has been shown that the coefficients $B_r^a$, $B_r$, $C_r^a$, and $C_r(3,0)$ are all
proportional to $k_o^2 A_o^2$.$^{21}$ In addition, the radiation fractal decrement $d_r$ is
proportional to the radiation energy density so that


$d_r = \tau_r E_r(D,d_r)$ (225)

$\quad = \tfrac{1}{4} \tau_r nRT k_o^2 A_o^2 + \cdots$

where $\tau_r$ is independent of $k_o$ and $A_o$. It then follows that


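$C_r(3,0) = k_o^2 A_o^2\, C_r'(3,0)$ (226)

$C_r(D,d_r) = k_o^2 A_o^2\, C_r'(D,d_r)$ (227)

$C_r^a = k_o^2 A_o^2\, C_r^{a\prime}$ (228)

where $C_r^{a\prime}$, $C_r'(3,0)$, and $C_r'(D,d_r)$ are independent of $k_o$ and $A_o$. Therefore
equation (224) can be written as

$\left(\frac{\Delta V}{V}\right)_r = \tfrac{2}{3} n^2 \left\{(3 - D)\,T \frac{\partial}{\partial T}[C_r'(3,0) - C_r^{a\prime}] + \tfrac{1}{4} \tau_r nRT^2 \frac{\partial}{\partial T}[C(3) - C^a]\right\} + \cdots$ (229)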


The void ratio for mechanical radiation in real gases is independent of wave
amplitude and frequency, and depends only on temperature and density. For the
case of fractal mechanical radiation in a homogeneous (D = 3) ground state,
equation (229) becomes

$\left(\frac{\Delta V}{V}\right)_r = \tfrac{1}{6} \tau_r n^3 R T^2 \frac{\partial}{\partial T}[C(3) - C^a] + \cdots$ (230)


Equation (230) is somewhat similar to the result for the ground state in
equation (208).


Even if the ground state is homogeneous (D = 3) the mechanical radiation
state can be fractal with fractal dimension $D_r = 3 - d_r$. In general, however,
the ground state may be fractal, and the fractal dimension of the radiation
field will be written as $D_r = D - d_r$. Only in limited temperature regions
will mechanical radiation in real gases have fractal properties, i.e., those
regions for which $(\Delta V/V)_r > 0$ in equation (229). The fractal dimension $D_r$
of mechanical radiation can be calculated from the general radiation Lagrangian
formalism given in Section 5. The voids in a mechanical radiation field are
due to the interaction of the excitations of a real gas with the vacuum state.
Mechanical radiation in an ideal gas is not fractal.

8. VOID RATIOS FOR FRACTAL SOLIDS AND QUANTUM LIQUIDS. In this section
the void ratios for the fractal ground state and excited states of solids and
quantum liquids are calculated. For simplicity only the T = 0 state will be
considered. In order to use equation (38) for the calculation of the ground
state void ratio, the T = 0 energy density and pressure must be calculated for
both the fractal and homogeneous states. It will be assumed that


$E_o^a = A n^{\sigma_o}$ (231)

$P_o^a = (\sigma_o - 1) E_o^a$ (232)

where A and $\sigma_o$ = constants independent of density. Using equation (29) with
$\gamma_o$ and $D_o$ taken to be constants independent of density gives the renormalized
energy densities of the fractal and homogeneous systems respectively as


$E_o(D_o) = \frac{E_o^a}{G(D_o)}$ (233)

$E_o(3) = \frac{E_o^a}{G(3)}$ (234)

where

$G(D_o) = D_o\sigma_o^2 - D_o\sigma_o(2 + \gamma_o) + D_o(1 + \gamma_o) + 1$ (235)

$G(3) = 3\sigma_o^2 - 3(2 + \gamma_o)\sigma_o + 3\gamma_o + 4$ (236)

Then the difference in energy densities for the fractal and homogeneous states
is given by

$E_o(D_o) - E_o(3) = \frac{(3 - D_o) F_o E_o^a}{G(3)\,G(D_o)}$ (237)

$F_o = \sigma_o^2 - \sigma_o(2 + \gamma_o) + (1 + \gamma_o)$ (238)

which satisfies

$G(D_o) = D_o F_o + 1$ (239)

$G(3) = 3F_o + 1$ (240)

Combining equations (231) and (234) gives

$n\,\frac{dE_o(3)}{dn} = \frac{\sigma_o E_o^a}{G(3)}$ (241)


and  therefore  equation  (38)  gives  the  following  expression  for  the  T  =  0  ground 
state  void  ratio  for  solids  and  quantum  liquids 


$\left(\frac{\Delta V}{V}\right)_o = \frac{[E_o(D_o) - E_o(3)]\,G(3)}{\sigma_o E_o^a} = \frac{(3 - D_o) F_o}{\sigma_o G(D_o)}$ (242)

In addition to the obvious dependence on $D_o$, the void ratio depends on the
constants $\sigma_o$ and $\gamma_o$.

Often in the literature the average energy per particle is introduced as
follows$^{10}$
$F_o = \frac{\kappa}{3}\left(\frac{\kappa}{3} - \gamma_o\right)$ (247)

These functions are expressed in terms of both $\kappa$ and $\gamma_o$.

Of particular interest are the cases of the non-relativistic and
ultra-relativistic non-interacting degenerate T = 0 Fermi gases. The
non-relativistic Fermi gas has the following properties

$\gamma_o = 2/3$, $\sigma_o = 5/3$ (248)

$G(D_o) = 1$, $F_o = 0$, $G(3) = 1$

For the ultra-relativistic case these quantities are

$\gamma_o = 1/3$, $\kappa = 1$, $\sigma_o = 4/3$ (249)

$G(3) = 1$, $G(D_o) = 1$, $F_o = 0$ (250)

Because $F_o = 0$ for the ideal non-relativistic and ultra-relativistic Fermi
gases, it is clear from equation (242) that no voids exist for these cases.
Ideal non-relativistic and ultra-relativistic Fermi gases are homogeneous with
$D_o = 3$.
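
The vanishing of the void ratio for the ideal Fermi gases can be checked with exact rational arithmetic. The sketch below assumes the reading $G(D_o) = D_o\sigma_o^2 - D_o\sigma_o(2+\gamma_o) + D_o(1+\gamma_o) + 1$ of equation (235), the factored form $F_o = (\sigma_o - 1)(\sigma_o - 1 - \gamma_o)$ implied by $G(D_o) = D_oF_o + 1$, and the form $(3 - D_o)F_o/[\sigma_o G(D_o)]$ for equation (242); the function names and the trial value $D_o = 5/2$ are illustrative only.

```python
from fractions import Fraction


def G(D, sigma, gamma):
    """Assumed reading of equation (235)."""
    return D * sigma**2 - D * sigma * (2 + gamma) + D * (1 + gamma) + 1


def F_o(sigma, gamma):
    """Factored form implied by G(D) = D * F_o + 1."""
    return (sigma - 1) * (sigma - 1 - gamma)


def void_ratio(D, sigma, gamma):
    """Assumed reading of equation (242) for the T = 0 ground state."""
    return (3 - D) * F_o(sigma, gamma) / (sigma * G(D, sigma, gamma))


# (sigma_o, gamma_o) for the two ideal degenerate Fermi gases.
non_rel = (Fraction(5, 3), Fraction(2, 3))
ultra_rel = (Fraction(4, 3), Fraction(1, 3))
D_trial = Fraction(5, 2)  # hypothetical fractal dimension below 3

for sigma, gamma in (non_rel, ultra_rel):
    print(F_o(sigma, gamma), G(D_trial, sigma, gamma), void_ratio(D_trial, sigma, gamma))
```

Because $F_o = 0$ in both cases, $G$ equals 1 for every trial $D_o$ and the void ratio vanishes identically, consistent with the statement that these gases are homogeneous.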

In general the Grüneisen parameter $\gamma_o$ is a function of the index $\kappa$. For
instance, the effective mass approximation for an interacting system gives the
following expression for the zero-temperature Grüneisen parameter$^{10}$

$\gamma_o \approx \frac{\kappa - 4}{3} = \sigma_o - \frac{7}{3}$ (252)

Using this relationship gives

$G(3) \approx 1 + \frac{4\kappa}{3} = 4\sigma_o - 3 > 0$ (253)

$G(D_o) \approx 1 + \frac{4\kappa D_o}{9} = \frac{4}{3} D_o \sigma_o - \left(\frac{4}{3} D_o - 1\right) > 0$ (254)

$F_o \approx \frac{4\kappa}{9} = \frac{4}{3}(\sigma_o - 1) > 0$ (255)


Equation (252) is valid only for $\kappa \ge 5$ or $\sigma_o \ge 8/3$. In general the value of
$D_o$ for solids or quantum liquids may be determined experimentally or possibly
theoretically from a T = 0 Lagrangian formulation outlined in Section 4. In
general T = 0 solids and quantum liquids are fractal for all physical densities
because equation (242) shows that $(\Delta V/V)_o > 0$.
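
The effective mass relations can be verified in the same way. This sketch assumes $\sigma_o = 1 + \kappa/3$ (an assumption, chosen because it reproduces $\sigma_o = 5/3$ for $\kappa = 2$ and $\sigma_o = 4/3$ for $\kappa = 1$) together with $\gamma_o = (\kappa - 4)/3$, and checks that $F_o = 4\kappa/9$ and $G(3) = 4\sigma_o - 3$ follow. The helper names are hypothetical.

```python
from fractions import Fraction


def effective_mass_parameters(kappa):
    """Return (sigma_o, gamma_o) under the assumed relations
    sigma_o = 1 + kappa/3 and gamma_o = (kappa - 4)/3."""
    kappa = Fraction(kappa)
    return 1 + kappa / 3, (kappa - 4) / 3


def F_from_sigma_gamma(sigma, gamma):
    """F_o = (sigma_o - 1)(sigma_o - 1 - gamma_o)."""
    return (sigma - 1) * (sigma - 1 - gamma)


results = {}
for kappa in (5, 6, 9, 12):
    sigma, gamma = effective_mass_parameters(kappa)
    F = F_from_sigma_gamma(sigma, gamma)
    G3 = 3 * F + 1
    # Store F_o, G(3), and whether the closed forms 4*kappa/9 and 4*sigma - 3 hold.
    results[kappa] = (F, G3, F == Fraction(4 * kappa, 9), G3 == 4 * sigma - 3)

print(results)
```

Since $F_o > 0$ and $G(D_o) > 0$ for every admissible $\kappa$, the sign of the ground state void ratio is set by $3 - D_o$ alone.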

Consider now the void ratio of mechanical waves in a solid or quantum
liquid. For simplicity only waves in a T = 0 system will be considered. The
wave equation (59) can be simplified by using equations (231) through (234) and
the approximation $E_{or}^a/E_o^a = E_{or}/E_o$ with the result that
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w 

in 


D  3 r  P  ( y  -  Y  )=DE  (o  -  1 )  (  y  -  Y  ) 
o  t.  o  o  or  o  or  o  o  or 

J 

and  therefore  equation  (65)  can  be  rewritten  as 


(256) 


2  d2E 
2  or 


+  C  E  =  Ea 
dn  o  or  or 


(257) 


where 


C  =Do(y  -Y  )  +  D  (1  +  y  )  +  1  +  t  [(1  +  y  )P  -  K  ] 
o  o  o  o  or'  o  or  orL  o'  o  o 


(258) 


and where the relation $d_{or} = \tau_{or}E_{or}$ is used, which is similar to equation (225)
that was used for the real gases. Assume now that$^9$


E  =jKK2A2 
or  4  o 
where $k_o$ and $A_o$ = nonrelativistic wave number and wave amplitude respectively,
and k and A = relativistic wave number and wave amplitude respectively. Placing
equation (259) into equation (257) gives
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where 
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Calculate  also  the  following  expression 
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Then  equation  (68)  gives  the  void  ratio  for  mechanical  waves  as  follows 


(271) 


(3  -  D  )F  +  x  [k  -  (1  +  v  )P  ] 
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Note  that 


Gr(3,0)  -  3For  +  1  (273) 

VV'ir’  ■  °oFor  +  1  +  +  VPo  -  K„]  ' 

A homogeneous ground state ($D_o = 3$) can have a fractal radiation excited state
described by
Both equations (272) and (275) are valid only for $(\Delta V/V) > 0$ and this
restricts the density regions in which radiation voids are possible. Only in
the unphysical regions below equilibrium density are equations (272) or (275)
negative. In fact at the equilibrium density of a T = 0 solid or quantum
liquid one has $P_o = 0$ so that $(\Delta V/V)_{or} > 0$ at this point. Radiation voids
are also possible in the high density regions beyond the equilibrium density
of an interacting system. Note that for ideal non-relativistic and
ultra-relativistic T = 0 Fermi gases $(1 + \gamma_o)P_o - K_o = 0$ and $D_o = 3$, and from equation
(272) the radiation state is homogeneous.


9.  CONCLUSION.  The  renormalization  group  equations  for  fractal  matter 
and  fractal  mechanical  radiation  are  presented  and  a  Lagrangian  formulation 
of  these  equations  is  developed.  For  limited  temperature  and  density  regions 
the  equilibrium  fractal  dimensions  of  the  ground  and  excited  states  of  real 
gases,  solids,  and  quantum  liquids,  may  possibly  be  obtained  by  minimizing  the 
Lagrangian  density  with  respect  to  the  fractal  dimension  of  the  system.  Ideal 
non-relativistic  and  ultra-relativistic  quantum  thermodynamic  systems  are  not 
fractal.  For  interacting  thermodynamic  systems  the  ground  and  excited  states 
are  fractal  in  nature.  The  ground  and  excited  states  of  real  gases  are  fractal 
only in limited temperature regions. Although in general $D \le 3$, it has not been
shown that the voids in matter and mechanical radiation fields are self-similar,
which is a basic characteristic of fractal systems.
MODELLING OF THE LEAN FLAMMABILITY LIMIT IN FLAME THEORY$^1$
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ABSTRACT.  The  phenomenon  of  the  lean  flammability  limit  is  modelled  by  a 
four-step  reaction  mechanism.  Analytical  results,  namely  that  a  fuel  mixture 
will  not  burn  if  it  is  too  lean,  are  obtained  through  use  of  activation-energy 
asymptotics. 

I. INTRODUCTION

The use of the one-step irreversible reaction in combustion modelling has
yielded much success, particularly in the context of activation-energy
asymptotics [1]. However, the neglect of radicals, or intermediates as they
are sometimes called, has precluded the modelling of some important phenomena.
One of these, the lean flammability limit, is the subject of study in this
paper.

By the lean flammability limit, we mean the phenomenon whereby a fuel
mixture is incapable of burning when it is too lean. Of course, a mixture may
not burn due to other causes, e.g., excessive heat loss, flow divergence,
etc., the modelling of which has been successfully done by the use of the
one-step model. The lean flammability limit, however, differs from such
external effects in that it involves a property of the mixture itself, namely
the fuel strength. It has been conjectured that the cause lies in the
chemistry, i.e., the reaction mechanism, thus necessitating the use of
multi-step kinetics in the modelling of this phenomenon.

The two-step Zeldovich-Linan mechanism [2] and other simple multi-step
schemes studied by Fife and Nicolaenko [3] uncovered many flame phenomena,
among which are stretch-resistance ([4], [5]), hysteresis, flame plateaux and
kinetic extinction (op. cit.), but they cannot model the lean flammability
limit. Peters and Smooke [6] attempted to model the phenomenon by using a
four-step model:
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F  +  R, 


R2  +  °2 


R2  +  °2  +  M 


R, 


2 

2R 


1 


PT  +  M 


2R}  +  M  - *  ?2  +  M  . 
Here F is the fuel, $O_2$ the oxidant, $R_1$ and $R_2$ the radicals, $P_1$ and $P_2$ the
products, and M a third body. Close examination of their results, however,
shows that the phenomenon modelled there was the kinetic extinction phenomenon
whereby a mixture, if it burns at all, burns at or above a critical
temperature. We shall not discuss the details here, but merely note that they
can be found in [7].


We  consider  here  the  four-step  model 


F  +  R, 


r2  +  c 


R2  +  C 


R2  +  °2 


2P2  +  M 


Here F and $O_2$ are respectively fuel and oxidant, $R_1$, $R_2$ and C are
intermediates, $P_1$ and $P_2$ are products and M is a third body. The first two
reactions have high activation energy while the last two have small (taken as
zero here) activation energy.


II.  GOVERNING  EQUATIONS 


To focus on the chemistry, we consider the model in the context of the
steady plane flame, which is governed by the following dimensionless system of
equations:
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Here  Y,  R,  S,  C,  X,  and  B,  respectively,  are  the  normalized  mass  fractions  of 


the  fuel  F  ,  radicals  R^ ,  Rg.  and  C,  oxidant  0 2  and  third  body  M,  T  is  the 


temperature;  S£,  J V,  C.  and  $  are  the  Lewis  numbers  (assumed  constant)  of  F, 
R^,  Rg.  C,  and  X  respectively.  The  non-dimensional ized  heat  releases  of  the 


four reactions are $q_1$, $q_2$, $q_3$, and $q_4$ respectively, whereas $\mathcal{D}_1$, $\mathcal{D}_2$, $\mathcal{D}_3$, and $\mathcal{D}_4$
are the Damköhler numbers, assumed independent of temperature. The
non-dimensionalized activation energy $\theta$ is derived from the first reaction
and $r = E_2/E_1$, the ratio of the activation energies of the second and first
reactions, is taken to be less than 2 for simplicity. These equations are to
be solved under the boundary conditions


$Y, R, S, C, X, T \to Y_f, 0, 0, 0, X_f, T_f$ as $x \to -\infty$;

boundedness is required as $x \to +\infty$.

Assuming that the Damköhler numbers are ordered:

$\mathcal{D}_3 < \mathcal{D}_4 < \mathcal{D}_2 < \mathcal{D}_1$

and denoting by $T_{23}$ the flame temperature such that

$\mathcal{D}_2 e^{-r\theta/T_{23}} = \mathcal{D}_3$,

one can show that no flame exists with flame temperature below $T_{23}$; a detailed
discussion can be found in [8]. Apparently, this is the kinetic extinction
phenomenon; analysis of the flame structure, however, reveals more. We now
focus our attention on temperatures close to $T_{23}$.
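
The defining relation for the kinetic extinction temperature can be solved in closed form, $T_{23} = r\theta / \ln(\mathcal{D}_2/\mathcal{D}_3)$, assuming the Arrhenius factor carries a negative exponent. The sketch below evaluates it; the function name and the numerical values of $\theta$, $r$, $\mathcal{D}_2$ and $\mathcal{D}_3$ are illustrative assumptions, not data from the paper.

```python
import math


def kinetic_extinction_temperature(theta, r, D2, D3):
    """Solve D2 * exp(-r * theta / T) = D3 for T.

    Requires D2 > D3 > 0, which matches the assumed ordering of the
    Damkohler numbers (D3 < D4 < D2 < D1).
    """
    if not (D2 > D3 > 0):
        raise ValueError("need D2 > D3 > 0 for a positive temperature")
    return r * theta / math.log(D2 / D3)


# Hypothetical dimensionless parameters with r < 2.
theta, r, D2, D3 = 50.0, 1.5, 1.0e6, 1.0e2
T23 = kinetic_extinction_temperature(theta, r, D2, D3)

# Substituting back should reproduce D3.
residual = D2 * math.exp(-r * theta / T23) - D3
print(T23, residual)
```

Raising $\mathcal{D}_2/\mathcal{D}_3$ lowers $T_{23}$, so a faster second reaction relative to the third widens the range of admissible flame temperatures.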


III.  ASYMPTOTIC  ANALYSIS 


Solution to the system (1) - (7) will be sought in the limit of infinite
activation energy, whereupon all reactions are confined within the reaction
zone where the appropriate coordinate is $f = \theta x$. The mass fractions are
ordered


r  =  e'1?.  r  .  s  .  e-2  »(r -2KT.  s 

4  2  4 


C  =  Ch  +  0  *C  ,  X  =  Xb  +  0_1X  ,  T  =  Th  +  0-1  T 


& 

Sh 


||V 


where  the  0(1)  variables  in  the  f-structure  are  designated  by  a  tilde;  and 


the  subscript  b  denotes  the  burnt  state.  Since  for  close  to 


^e  0/T*  <  2>2e  r0/T*  ~  S3  <  a4  , 


R  and  S  are  exponentially  small;  more  precisely,  the  radicals  and  R^  are 

produced  and  consumed  entirely  within  the  reaction  zone,  so  that  on  the 
x-scale, 


Y  =  Y,  (1  -  e  j  .  R  =  S  =  0  ,  C  = 


\+  Xf)e 


Tf  +  (T*  -  Tf)  e* 


for  x  $  0  respectively.  Also,  we  find 


°b  =  Xf  "  Xb  • 


The  governing  equations  in  the  reaction  zone  are,  to  leading  order: 


(V^  ~  r\ 

-X1  4-DR  ?eT/TJ 

df2 


~  9  ~  Q 

0  =  -DR  Ye  *  +  2DSC^e  ** 


-  D?B 


0  =  DR  Y<J/T»  -  DSCber'''/T»  -  DS!^ 


-4j-  =  Di  7.?/T2  -  dSc,/^ 


r1  -iS.  =  -oix. 

df2  b 


J^T  T  rr2  A,  J 

=  qxDR  Ye171**  +  q  DSC^e  *  +  q3DSXb  +  q4DRH3 


n  D  „-3  *1  -20XT  „-3  D1  -29/T  „2 

=  *  T'  S  "57" 

M  4  4 

r 


k«. 


is  an  0(1)  parameter  and  is  to  be  determined  as  part  of  the  solution  for  Mr> 


the  reference  mass  flux. 


Equations  (11)  and  (12)  express  what  may  be  called  the  local  equilibria 
of  the  two  radical  species  and  Rg  :  given  Y,  we  find 

R  =  S  =  0  (16) 

is  always  possible,  whereas  there  is  a  second  solution 


~  vrT/1«  -  *b  ^ 

W  —  ...  •  ■  - 

iV  o  n 

r  „rT/TJ  +  y  B 

°be  *  +  \ 

(17) 

n 

-V  'V  'p 

S  =  R  Ye171* 

\ 

Tsr2 

for  6^er  *  -  >  0  .  Since  radicals  must  be  produced  in  the  reaction 

zone,  the  second  solution  must  hold  in  some  part  of  the  reaction  zone;  on  the 
other  hand,  that  part  cannot  lie  on  the  fresh  side  of  the  reaction  zone  since 

~  A/  A/ 

R  and  S  must  vanish  there  and  Y  does  not. 

We  conclude  that  the  reaction  zone  is  divided  at  f  =  f  (say)  such  that 
the  solution  (16)  holds  for  f  £  f*  and  (17)  holds  for  f  >  f*  .  At 

f  =  f  ,  continuity  of  the  solutions  require  that 

i  ,  .  *b . 

T(f  )  =  — —  ln(-^— •)  . 

Since  the  maximum  temperature  is  attained  as  f  - »  +  00  ,  it  follows  that 

Xb<Cb 

or,  equivalently,  because  of  (9), 


\<5*!  ■  (18) 

This  condition  in  turn  implies  that  Y^.  must  be  greater  than  a  certain  value 
for  the  solution  to  exist,  as  will  be  shown  later. 




Substitution of the local equilibria expressions (17) into the fuel,
temperature and oxidant equations (10), (15), and (14) yields


-*-1  =  d 

df 


ctT/tJ  +  v 
°be  *  +  % 


B- 


(19) 


jfi. 

df2 


=  D  [  (q.  +  q-  -  q.)  +  (q0  -  q-  +  2q  ) 
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S' 


rT/T2 


*  +  X, 


s*rT/I^  *  S 


<v  ft 

~2e2T/T^ 
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(20) 


.-i  d2;  .  _d_c»°tT/1*  -  s 


~2  2T/T 
Yze  h 


df" 


(vrT/^  *  *J‘ 

Boundary  conditions  obtained  from  matching  with  the  outer  solution  (8)  are 

<v 

dY  =  -iT1  Yf  +  0(1)  .  4r-  =  (T*  -  Tf)  +  °0)  • 


(21) 


df 


dX 


’  df 


^  =  *<\  '  Xf>  + 


as 


f  if* 


(22) 


Y  =  o( 1 )  ,  T  =  o ( 1 )  .  X  =  o( 1 )  ,  as  f  - ►  +  »  . 

The  above  system  is  then  solved  for  the  special  case 


ql  ‘  q3  +  2q4  =  ° 


ql  +  q3  "  q4  >  ° 


for  which  an  analytical  solution  is  possible.  We  shall  not  give  the  details 
here  but  only  quote  the  final  results.  We  find 


Sf2  T®  D 


M 


4(qj  +  q2  +  q4)3  Y2  B 


I(  y*  :  r.  y*  ) 


(23) 




xf  -  *b  =  - 
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(24) 


.Tl/2, 

41  (  y  ;  r.  y  ) 


_ ± _ 1  ■  -y  l _ 

r 

cosh  o  (  y  -  y)  +  i 


where 


I(y  :  r,  y  )  = 


1  - 


sr(y  -  y)/2  +  J 


2  -y  , 
y  e  dy 


(25) 


and 


y*  =  -f-  in  (  1) 


(26) 


(There is also a local structure for the radicals $R_1$ and $R_2$ at $f = f^*$,
namely that which connects the trivial solution (16) with the local
equilibrated solution (17). Details can be found in [8].)


IV.  CONCLUSION 

For  the  special  case  considered,  the  relationship  between  Y^.  and 

is given by (24) and a similar result will be obtained, albeit numerically,
in general.

Figure  1  gives  a  plot  of  /X^.  vs.  /X^  for  the  parameter  value 

r  =  ~  .  We  believe  it  is  typical;  it  shows  that 


xb<ixt 
corresponds  to 

Yf  >  Xf  =  Yfl  . 

Since  this  is  the  condition  under  which  a  solution  for  the  flame  obtains,  we 
call  Y^  the  lean  flammability  limit.  A  mixture  will  not  burn  if  its  fuel 

strength  is  below  Y^. 
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Introduction 


At the second and fourth Army Conferences on Applied Mathematics and Computing
we introduced the notions of loop and elementary orbit. The implicational strength
of loops and elementary orbits enabled us to prove two refinements of Sarkovskii’s
Theorem, Theorem (SR) and Theorem (SR II) [2,3]. In fact, from Theorem (SR
II) follows, as a corollary, a new proof of Sarkovskii’s Theorem in a most natural
way. If we call each of the (n - 1)! different n-periodic orbits a continuous function
f : R → R can have a period type, one is naturally led to what we have called the
type-problem.


Statement of the Type-Problem. Let f : R → R be continuous. Given a positive
integer n and an n-periodic orbit of specified type, find for every positive integer m
the types of m-periodic orbits that must exist.


At  this  stage  of  knowledge  the  type-problem  is  an  open  problem  of  considerable 
complexity.  Even  the  restricted  (or  “little”)  type-problem  appears  to  be  of  great 
difficulty. 


Statement of the Restricted Type-Problem. Let $f : R \to R$ be continuous.
Given a positive integer $n$ and an $n$-periodic orbit of specified type, find for every
positive integer $m < n$ the types of $m$-periodic orbits that must exist.

It is well to point out that Sarkovskii's result gives only the (complete) answer to the
"typeless" problem: "Given a positive integer $n$ and an $n$-periodic orbit of any type,
for which other integers $m$ does there exist an $m$-periodic orbit of any type?"

In the first part of this presentation the solution of the restricted type-problem for
$N \le 4$ is given. It is necessary for this purpose to introduce the notion of a separated
loop, a direct generalization of a loop. We shall see that separated loops do not obey
a linear order. This is in contrast to loops and elementary orbits, which are not only
linearly ordered individually, but also when taken together. Accordingly, the various
period types that appear in the solution of the restricted type-problem for $N \le 4$
are not linearly ordered.

In the second part of this presentation our notions make contact with the notion
of turbulence as introduced by Block and Coppel [4]. We show that the notions of
turbulence and infinite loop are equivalent, i.e., we prove that "$f : R \to R$ is turbulent
if and only if $f$ has an infinite loop."

Unless new, our notation is unaltered from [1], [2], and [3].

This  presentation  represents  only  part  of  our  joint  work  under  the  U.S.  Army  Summer 
Faculty  Research  and  Engineering  Program. 

1.  The  Partial  Ordering  of  the  Separated  Loops. 

Definition: A $p$-periodic orbit ($p \ge 2$) is called an $(m,n)$-separated loop if $p = m + n$,
$m, n \ge 1$, and the points of the orbit satisfy

$$x_{m+n} = x_0 < x_1 < \dots < x_{m-1} < x_{m+n-1} < \dots < x_m .$$

We adopt the notation that $L_{m,n}$ shall mean that $f : R \to R$ has an $(m,n)$-separated
loop.
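
As an illustration of the definition, the following Python sketch enumerates the $(n-1)!$ period types of an $n$-periodic orbit and picks out those that are separated loops. It assumes that a period type is recorded as the left-to-right ordering of the orbit indices with $x_0$ taken as the smallest point, and that the separated-loop ordering is $x_0 < x_1 < \dots < x_{m-1} < x_{m+n-1} < \dots < x_m$; the function names are hypothetical and not taken from [1]-[3].

```python
from itertools import permutations
from math import factorial


def period_types(n):
    """List the orderings of an n-periodic orbit with x_0 as smallest point.

    Each type is a tuple of orbit indices read from the smallest point to
    the largest; index 0 always comes first, so there are (n - 1)! types.
    """
    return [(0,) + rest for rest in permutations(range(1, n))]


def separated_loop_type(m, n):
    """Index tuple for x_0 < x_1 < ... < x_{m-1} < x_{m+n-1} < ... < x_m."""
    return tuple(range(m)) + tuple(range(m + n - 1, m - 1, -1))


def separated_loop_parameters(order):
    """Return (m, n) if the ordering is an (m, n)-separated loop, else None."""
    p = len(order)
    for m in range(1, p):
        if tuple(order) == separated_loop_type(m, p - m):
            return (m, p - m)
    return None


if __name__ == "__main__":
    for t in period_types(4):
        print(t, separated_loop_parameters(t))
```

Under these assumptions, three of the six 4-periodic types are separated loops, namely the $(1,3)$, $(2,2)$ and $(3,1)$ cases.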


We  have  the 


Theorem: Let $f : R \to R$ have an $(m,n)$-separated loop. Then there exist two
$(m-1,n)$-separated loops and two $(m,n-1)$-separated loops, except that a $(2,1)$-
or a $(1,2)$-separated loop only implies one $(1,1)$-separated loop. In particular,

$$L_{m-1,n} \Leftarrow L_{m,n} \Rightarrow L_{m,n-1} .$$


The  proof  is  a  direct  application  of  the  following  well-known  lemma. 

Lemma. Let $J_1, J_2, \dots, J_n$ be compact intervals such that $f(J_i) \supset J_{i+1}$, $i = 1, 2, \dots, n-1$,
and $f(J_n) \supset J_1$. Then there is a point $x_0 \in J_1$ with $x_i \in J_{i+1}$, $i = 1, 2, \dots, n-1$, and
$x_0 = x_n$. The point $x_0$ has period $n$ or $n'$, where $n'$ divides $n$.

The following diagram displays the partial ordering on all separated loops.

[Diagram: partial ordering of the separated loops $L_{m,n}$; not recoverable from the source.]

That in general no other implications hold can be demonstrated by examples.


2. The Solution of the Restricted Type-Problem for $N \le 4$.


The six types of 4-periodic orbits are given by

P\  *  X4  =  Xq  X3  Xt  X2  Z*2,2  •  £4  =  %0  ^  X^  ^  2-3  ^  X2 

i  .  X4  =  Xo  *C  Xi  <  X2  <  X3  Z/j  3  ;  X4  =  Xq  <C  X3  <C  X2  Xi 

£■4  :  X4  =  Xo  <  x2  <  Xi  <  x3  £<  :  x4  =  x0  <  x2  <  x3  <  ij 

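We present the solution of the restricted type-problem for $N \le 4$ diagrammatically:

[Diagram: implications among the period types for $N \le 4$, including $L_{1,3}$ and $L_{3,1}$; not recoverable from the source.]

The proofs of the indicated implications follow from the Theorem and Lemma in
Section 1.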


Block and Coppel introduced in [4] the concept of turbulence. By definition a continuous
function $f : R \to R$ is turbulent if there exist compact intervals $J$ and $K$ such
that $J \cap K$ is at most a singleton and

$$f(J) \cap f(K) \supset J \cup K .$$

To show the equivalence "$f : R \to R$ is turbulent if and only if $f$ has an infinite
loop", we recall a convenient notation and two lemmas.

For intervals $J$ and $K$ we write $J \le K$ if $x \le y$ whenever $x \in J$ and $y \in K$. It follows
that $J \le K$ or $K \le J$ if and only if $J \cap K$ is at most a singleton.
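
A small Python check may make the definition concrete. The sketch below tests the turbulence condition $f(J) \cap f(K) \supset J \cup K$, with $J \cap K$ at most a singleton, for one specific map. The tent map $f(x) = 1 - |2x - 1|$ on $[0,1]$ and the chosen intervals are assumptions made purely for illustration and do not come from [4].

```python
from fractions import Fraction


def tent(x):
    """Tent map on [0, 1]: f(x) = 1 - |2x - 1| (a hypothetical test function)."""
    return 1 - abs(2 * x - 1)


def tent_image(interval):
    """Image of a compact interval [a, b] inside [0, 1] under the tent map.

    The map is monotone on each side of x = 1/2, so the extreme values are
    attained at the endpoints or at the turning point 1/2.
    """
    a, b = interval
    values = [tent(a), tent(b)]
    if a <= Fraction(1, 2) <= b:
        values.append(tent(Fraction(1, 2)))
    return (min(values), max(values))


def contains(outer, inner):
    """True if the interval `outer` contains the interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def is_turbulent_pair(J, K):
    """Check J <= K and that both f(J) and f(K) contain both J and K."""
    ordered = J[1] <= K[0]  # J and K share at most one point
    images = (tent_image(J), tent_image(K))
    return ordered and all(contains(img, piece) for img in images for piece in (J, K))


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(is_turbulent_pair((Fraction(0), half), (half, Fraction(1))))
```

For intervals, requiring each image to contain both $J$ and $K$ is the same as requiring $f(J) \cap f(K) \supset J \cup K$, which is why the check is written as four containments.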

If we call the interval $J'$ minimal with respect to the property $f(J') = K$ if no proper
subset of $J'$ has this property, the first well-known lemma reads as follows.

Lemma 3.1. If $f : R \to R$ is continuous and $J$ and $K$ are intervals such that $K$ is
compact and $f(J) \supset K$, then there is a minimal compact interval $J' \subset J$ such that
$f(J') = K$.

We finally recall the following lemma [3].

Lemma 3.2. If $f$ has a critical point $c_0$ such that $c_0 < c_{-2} < c_{-1}$, then $f$ has an
infinite loop satisfying

$$c_0 < \dots < c_{-n} < \dots < c_{-2} < c_{-1} .$$

The same statement holds with all inequalities reversed.

If we say that property $T$ holds if $f$ is turbulent, we have the following

Theorem. $T \Leftrightarrow L(\infty)$.

Proof. Let $f$ be turbulent. We assume without loss of generality $J \le K$. Then
there are minimal compact intervals $J_1 \subset J$ and $J_2 \subset J$ such that $f(J_1) = K$
and $f(J_2) = J$. Since $f(J_1 \cap J_2) \subset f(J_1) \cap f(J_2) = K \cap J$ and $K \cap J$ is at most
a singleton, we conclude from the minimality of $J_1$ and $J_2$ that $J_1 \cap J_2$ is at most a
singleton and hence we have either $J_1 \le J_2 \le K$ or $J_2 \le J_1 \le K$. In case $J_1 \le J_2 \le K$,
we conclude the existence of a critical point $c_0 \in K$ and predecessors $c_{-1} \in J_1$ and
$c_{-2} \in J_2$ from the respective conditions $f(K) \supset K$, $f(J_1) = K$, and $f(J_2) = J \supset J_1$.
From $J_1 \le J_2 \le K$ follows $c_{-1} \le c_{-2} \le c_0$. We note now that no equality in the
last statement can hold, for that would force $J_2$ to be a singleton, which is impossible.
Hence $f$ has an infinite loop by Lemma 3.2. The case $J_2 \le J_1 \le K$ is similar.

Conversely, if $f$ has an infinite loop, then there is, in particular, a critical point $c_0$
and predecessors $c_{-1}$ and $c_{-2}$ satisfying $c_0 < c_{-2} < c_{-1}$ (or $c_0 > c_{-2} > c_{-1}$). We let
$J = [c_0, c_{-2}]$ and $K = [c_{-2}, c_{-1}]$ and verify that $J \le K$ and $f(J) \cap f(K) \supset J \cup K$
hold, i.e., $f$ is turbulent. This completes the proof of the theorem.

We remark that for applications the simple sufficient condition $x_3 < x_2 < x_0 < x_1$
and Lemma 3.2 are excellent means to establish turbulence.
ABSTRACT. A permutation of N objects can be implemented by a
neural net which recognizes the initial arrangement and then replaces it
with the final arrangement. This is the same process found in parallel
computer architectures using symbolic substitution principles, and in turn
is the same function performed by optical correlator devices. Since every
group is isomorphic to a subgroup of $S_N$, the symmetric group, these
interrelationships provide a powerful mathematical base for optical
computers and neural networks. Neural nets for the first few symmetric
groups are presented. The number of interconnected nodes is shown to be
related to the number of strictly parallel inputs and outputs, and a group
categorization for any parallel interconnected network is discussed.


I. INTRODUCTION. There exists a functional correspondence among
neural networks, optical correlators, symbolic substitution, the
symmetric group $S_N$, and digital computers. It provides a mechanism for
the design, implementation, and use of parallel/serial processor
architectures. The basic common feature is that of symbolic substitution:
identify a given symbol or pattern and replace it with another symbol or
pattern. Symbolic substitution was introduced by Huang as a method
of implementing digital computers with optical techniques. The functional
equivalence of optical correlation and symbolic substitution was recently
stated by Casasent. The concept of a digital computer as a finite state
machine has long been recognized and provides a direct correspondence
with the permutation elements of the symmetric group $S_N$. The ability of
neural networks to recognize and replace spatial patterns, and thus
perform symbolic substitution, has been shown by Grossberg. A common
thread has been woven through these five ideas, and their unification has
in principle been achieved as indicated symbolically in Figure 1.

Their equivalence serves as a reversible recipe for solving processor
architecture problems, for designing explicit neural networks which
implement symbolic substitution, for specifying the optical hardware,
and for programming the processor by duplicating the group structure of
the problem with the group structure of the processor.

II. CORRESPONDENCES. The following correspondences are
summarized in Figure 2.

a. Symbolic Substitution. Symbolic substitution consists of
identifying a given pattern or symbol, removing it, and replacing it with
another pattern or symbol. The act of replacement implies the existence
of gain in a physical system.

b. Optical Correlators. Consider the correlation of two
functions, given by

$$C(x,y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} du\,dv\; A(u,v)\, B^{*}(u-x,\, v-y) .$$

When A = B in an optical matched filter correlator, the output is a plane
wave which can then be brought to a point focus. The point
approximates a delta function centered on the location coordinates of B.
If, on the other hand, B is initially a delta function, then the correlation
is a replica of the pattern A, again centered on the location of
B.⁶ Suppose we have two correlators, with reference images f and g,
respectively, and they are joined so that the output of the first one is
the input to the second. The first one identifies all occurrences of f in
its input scene and supplies a corresponding delta function map to the
second. It, in turn, produces an output scene with its reference image g
written at every location where f was present in the original scene.

This tandem correlator has then performed symbolic substitution f→g: it
recognized the subpattern f and substituted in its place the subpattern g.
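
The tandem arrangement can be sketched in a few lines of Python on a discrete one-dimensional binary scene. This is only an illustration under stated assumptions: the scene is coded as ±1 so that the correlation reaches its maximum value (the length of f) exactly where f occurs, occurrences of f are assumed not to overlap, and the function names are hypothetical. It is not a model of the optical hardware.

```python
def bipolar(bits):
    """Map a 0/1 pattern to -1/+1 so that only exact matches give the peak value."""
    return [2 * b - 1 for b in bits]


def correlate(scene, ref):
    """Discrete 1-D correlation C[x] = sum_u scene[x + u] * ref[u]."""
    n, m = len(scene), len(ref)
    return [sum(scene[x + u] * ref[u] for u in range(m)) for x in range(n - m + 1)]


def tandem_substitute(scene, f, g):
    """First stage: locate f and emit a delta map. Second stage: write g there."""
    if len(f) != len(g):
        raise ValueError("this sketch assumes f and g have the same length")
    peaks = correlate(bipolar(scene), bipolar(f))
    delta_map = [1 if c == len(f) else 0 for c in peaks]
    out = list(scene)
    for x, hit in enumerate(delta_map):
        if hit:
            out[x:x + len(g)] = g  # replay the reference image g at location x
    return delta_map, out


if __name__ == "__main__":
    scene = [0, 1, 1, 0, 0, 1, 1, 0]
    delta_map, out = tandem_substitute(scene, f=[0, 1, 1], g=[1, 0, 0])
    print(delta_map)
    print(out)
```

The first stage plays the role of the matched filter (recognition), and the second stage corresponds to correlating the delta map with the reference image g (replacement).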
c. Neural Networks. Next consider the Grossberg neural model of
an instar triggering an outstar.⁵ The instar recognizes the
previously-encoded training distribution f and delivers a recognition
signal to the outstar. It, in turn, plays back a second previously-encoded
distribution g on its field of neural nodes. Again, this is the function
done by symbolic substitution: recognize f and substitute g. As noted
earlier, the actual act of substitution implies gain.

d. The Symmetric Group. The symmetric group $S_N$ is the group of
all permutations of a collection of N distinct objects. A permutation
element consists of an initial arrangement, or pattern, of the N objects,
which is first identified, and is then replaced by a second arrangement
of the N objects. A single permutation corresponds to one act of
symbolic substitution.

e. Digital Computers. A digital computer, viewed as a finite
state machine, goes from its initial state of data words and addresses
to its final state by executing its program. The initial and final states
are the initial and final arrangements of N objects, N being, in this case,
a very large number, and the program corresponds to the particular
permutation element which was used.

III. GROUP NETWORKS. In this example, neural networks are
devised which model group elements and the group product rule. It
leads to a characterization of the group in terms of the number of nodes
and input connections required to model it. This characterization is then
reversed so that a given network with a specified number of nodes and
input connections per node can be identified with the group number N. It
is important to note that this reverse characterization is incomplete; it
does not take into account the actual connection matrix. What it says is
that for the particular number of nodes and interconnects given, it would
be possible to form the Nth group, but does not necessarily indicate that
such networks are actually present.




Cayley's theorem states that all groups are isomorphic to a subgroup
of $S_N$. Neural networks which form all the permutation elements and
provide all possible group products (at least once) can accordingly model
processes described by a group. All permutations of a collection of N
distinct objects comprise the elements of $S_N$. There are N! elements. The
group operation is two successive permutations.⁷
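
The group facts used here are easy to check by brute force for small N. The Python sketch below lists the elements of $S_N$ as tuples, composes them, and verifies closure; it also verifies that the three cyclic rotations form a subgroup of $S_3$, a small instance of the embedding in Cayley's theorem. The representation of a permutation as a tuple of images and the function names are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial


def symmetric_group(n):
    """All permutations of positions 0..n-1, each as a tuple of images."""
    return list(permutations(range(n)))


def compose(p, q):
    """The permutation i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(p)))


def apply(p, arrangement):
    """Replace an arrangement by the permuted one: one act of substitution."""
    return tuple(arrangement[p[i]] for i in range(len(p)))


def is_closed(elements):
    """True if the product of any two elements is again in the set."""
    pool = set(elements)
    return all(compose(p, q) in pool for p in pool for q in pool)


if __name__ == "__main__":
    s3 = symmetric_group(3)
    print(len(s3), is_closed(s3))
    rotations = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # a copy of the cyclic group of order 3
    print(is_closed(rotations))
    print(apply((1, 2, 0), ("a", "b", "c")))
```

With this convention, applying p and then q to an arrangement gives the same result as applying `compose(p, q)` once, which is the sense in which two successive permutations form the group product.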


An ideal totally interconnected Grossberg slab (memoryless) with
recurrent, shunting, on-center/off-surround subnets will, in the extreme
binary limit, choose the single most active node and drive all the others to
zero. Consider such a slab with N nodes. It can have N distinct states,
each consisting of one active (and normalized) node and N−1 inactive
nodes. If the nodal output channels are permuted to N possible
connections, then that slab can perform one permutation.


Figure 3 shows the networks for the $S_1$, $S_2$, and $S_3$ group elements. The
Grossberg subnets require that each node receive an input from all the
nodes including itself, plus an input from each of the N input channels.⁹
Some of the inputs excite the node and some inhibit it. Here, both types
are simply counted as inputs. Outputs are not counted. To form all the
elements (which, as yet, are not connected to each other) of $S_N$ requires N
nodes per element and 2N inputs per node. There are N! elements.
The total number of nodes is $N \cdot N!$ and the total number of input connections
is $2N^2 \cdot N!$. Suppose further we wanted to be able to perform, at least
once, all possible group products. Then each element's N outputs must
connect to the inputs of every element including itself. This is shown in
Figure 4 for an element of the $S_3$ group. This requires additional input
connections per element consisting of N times the number of elements
transmitting to it. This is true for every element. The total number of
input connections for group products is $N \cdot (N!)^2$. These group and
element counts are shown in Figure 5, where the standard gamma
function is shown for comparison. The overall number of input
connections per node is $(2N^2 N! + N (N!)^2)/(N \cdot N!)$, or $2N + N!$. In
loose orders of magnitude, and for large N, the number of inputs is
roughly the square of the number of nodes.
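
The counts in this paragraph can be tabulated directly. The following Python sketch evaluates the formulas quoted in the text ($N \cdot N!$ nodes, $2N^2 \cdot N!$ element input connections, $N \cdot (N!)^2$ group product input connections, and $2N + N!$ inputs per node); the function name and the dictionary keys are hypothetical.

```python
from math import factorial


def group_network_counts(n):
    """Node and input-connection counts for the S_N group network."""
    elements = factorial(n)                  # N! permutation elements
    nodes = n * elements                     # N nodes per element
    element_inputs = 2 * n * nodes           # 2N inputs per node: 2 N^2 N!
    product_inputs = n * elements ** 2       # N (N!)^2 for the group products
    inputs_per_node = (element_inputs + product_inputs) // nodes  # 2N + N!
    return {
        "elements": elements,
        "nodes": nodes,
        "element_inputs": element_inputs,
        "product_inputs": product_inputs,
        "inputs_per_node": inputs_per_node,
    }


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, group_network_counts(n))
```

The integer division is exact because the total number of inputs is a multiple of the number of nodes for every N.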




The number of distinct objects is N. The above group networks can
handle them all in parallel. N can be interpreted as the number of parallel
inputs which the group system can handle in a fully interconnected
manner, and thus is a measure of the parallel capacity of a memoryless $S_N$
neural network. Informally, it represents how many things the networks
can do at the same time. Given the parallel capacity N, the number of
nodes, element input connections, and group product input connections can
be listed. On the other hand, given the node and connection count for a
network, one can estimate the parallel capacity, or group number, for that
network. It is known that in the neural cortex there are about $J^2$
interconnections for J neurons. This agrees with the group construction. For
$10^4$ interconnections per neuron in the cortex, the parallel capacity
would be between N = 4 and N = 5.

Consider the group networks discussed above. If a given basis is chosen
for the permutation elements, such as the binary N-node slabs, then
one can distinguish between element operations and group operations. The
element operations can be done in parallel. They can be called up to
perform a serial sequence of group operations. The elements are analogous
to subroutines and the sequence of group operations is like an overall
software program. This gives a guide from group theory to show which
operations should be done in parallel and which should be done in series. If
"a symmetric group computer" is made capable of calling up a desired
sequence of group operations matching those in the problem to be solved,
then these comprise the program sequence and are done serially. The
individual group elements are the subroutines and can be done in parallel
using the neural net designs of Section 3, and can be implemented in
optical correlator hardware according to the functional correspondence
reviewed in Section 2 b. of this paper.

A basic feature of groups is that of closure. Any product of its
elements yields another element of the group. In principle, then, serial
operations should be unnecessary because one could simply activate the
single parallel element representing the entire product of the other
relevant elements. However, for large and complex systems, the problem
of finding this element may require taking the serial products of the
other elements in the first place. It is for these systems that a
'group computer' may offer an advantage, not for those which can be
handled analytically.
The common functional property f→g is found in
neural nets, group theory, and computers.
Abstract

Nonlinearities in wave equations lead to focusing and defocusing
of solutions. Focusing causes sharply defined wavefronts. The interaction
of such sharply defined wavefronts and more generally of nonlinear
hyperbolic waves is of fundamental importance and includes
such phenomena as Mach triple point formation, shock wave diffraction
patterns and the study of Riemann problems in one and higher
dimensions.

Recent progress in the study of nonlinear hyperbolic wave interactions
has revealed a surprising range of new mathematical phenomena
and structures. This mathematical theory should be useful in the
design of improved computational algorithms and in part was
motivated by such considerations. It is also of considerable interest for
its own sake as new mathematical phenomena and is also of interest in
terms of the direct insight it provides into physical phenomena.


1 Supported in part by the Applied Mathematical Sciences subprogram of the Office of Energy Research,
U.S. Department of Energy, under contract DE-ACQ2-'6ER0?0“

2 Supported in part by the Army Research Office, grant DAAG29-SJ K-0188
1. Introduction


1.1. The Problem Formulation.

We consider the nonlinear system of conservation laws

$$U_t + F(U)_x = 0 \qquad (1.1)$$

and let

$$\lambda_1(U) \le \dots \le \lambda_n(U) \qquad (1.2)$$

be the eigenvalues of the Jacobian matrix

$$A(U) = \frac{\partial F}{\partial U} , \qquad (1.3)$$

while

$$e_1(U), \dots, e_n(U) \qquad (1.4)$$

are the corresponding right eigenvectors. Equation (1.1) expresses conservation of
the components of $U$:

$$\frac{d}{dt} \int U(x,t)\,dx = 0 .$$

The eigenvectors $e_j(U)$ are the normal modes for the propagation of small amplitude
signals, linearized about the state $U$, while $\lambda_j(U)$ are the corresponding wave speeds.
The $\lambda_j$ are assumed to be real, which is equivalent to the system (1.1) being hyperbolic.
As we will see below, this assumption is not satisfied in all cases. Similarly,
we say that (1.1) is strictly hyperbolic if the $\lambda_j$ are real and distinct. Points $U_0$ of
coinciding wave speeds

$$\lambda_i(U_0) = \lambda_{i+1}(U_0)$$

are called umbilic points. A remarkable theory has resulted from the careful analysis
of these umbilic points.
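
To make the definitions concrete, the following Python sketch computes the wave speeds of a $2 \times 2$ system and classifies a state as strictly hyperbolic, umbilic, or not hyperbolic. The quadratic flux $F(u,v) = ((u^2 - v^2)/2,\ -uv)$ is a hypothetical model chosen only because its Jacobian is symmetric with eigenvalues $\pm\sqrt{u^2 + v^2}$, so that the origin is an isolated umbilic point; it is not taken from the text, and the function names are illustrative.

```python
import math


def jacobian(u, v):
    """A(U) = dF/dU for the model flux F(u, v) = ((u^2 - v^2)/2, -u*v)."""
    return ((u, -v), (-v, -u))


def eigenvalues_2x2(a):
    """Real eigenvalues (smaller first) of a 2x2 matrix, or None if complex."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace * trace - 4 * det
    if disc < 0:
        return None  # complex pair: the system is not hyperbolic at this state
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)


def classify(u, v):
    """Classify the state U = (u, v) by its wave speeds."""
    speeds = eigenvalues_2x2(jacobian(u, v))
    if speeds is None:
        return "not hyperbolic"
    # Exact comparison is adequate for this illustration only.
    return "umbilic" if speeds[0] == speeds[1] else "strictly hyperbolic"


if __name__ == "__main__":
    print(eigenvalues_2x2(jacobian(3.0, 4.0)), classify(3.0, 4.0))
    print(eigenvalues_2x2(jacobian(0.0, 0.0)), classify(0.0, 0.0))
```

In floating-point practice one would compare the wave speeds with a tolerance rather than exactly.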


1.2.  Examples. 


Conservation laws are basic to physics and the equations (1.1) are often the
fundamental or lowest order description of a physical situation. More refined
descriptions may arise as modifications to or perturbations of (1.1). For example the right
hand side may be replaced by a diffusion term to represent transport effects such as
viscosity or heat conduction. It may be replaced by a source term of a geometrical
nature, to represent flow through a duct of variable cross section or through a
coordinate system (such as radial flow in polar coordinates) in which volumes and densities
are not conserved. There may be source terms of a chemical nature to represent
stored (chemical) energy, not included in the state $U$. Some refined descriptions will
preserve the same form as (1.1) but will add new variables and equations, for example
to represent additional species in a chemical reaction or additional energy
partition modes for nonequilibrium thermodynamics.

The specific examples and theories contained within this framework are almost
unlimited. The Euler equations for a compressible fluid (gas) are in some sense the
prototype example. Here the conservation laws express conservation of mass,
momentum and energy while $F$ defines the corresponding fluxes. The fluid can
describe several species (multifluid equations) which can react chemically
(chemically reacting equations) or mix (multiphase flow). Continuum equations of elastic
and elastic-plastic flow are defined by conservation laws. Magneto-hydrodynamics is
a conservation law. The equations for saturation and concentration of fluids in an oil
reservoir are of conservation type, as are the equations for adsorption.

1.3.  Theory  and  Computation  in  Mathematics 

Experimental mathematics refers to a working method in which computer
experiments play an essential role in the discovery of ideas and the formulation of
conjectures. While laboratory experiments, often filtered through the mind of a theoretical
physicist, have inspired mathematical thinking for centuries, the direct use of
experimental studies by mathematicians from computer simulations is a recent
development. There is no doubt that experimental mathematics has played a large role in the
recent progress in our understanding of the interactions of nonlinear hyperbolic
waves. A numerical Riemann solver of a very general nature for 2×2 systems was an
essential tool in the development of insights and conjectures to guide the
mathematical theory [16]. The key tools of an analytic nature have been bifurcation theory,
global analysis and geometry.

Equally important has been the connection of nonlinear wave interactions to
applications. In fact the wave interaction phenomena have a number of complex
aspects. Considered for their own sake, such problems are easily put aside for
theories which are more elegant even if less profound. However the firm anchor of
these wave interactions to such applications as oil reservoirs, elasticity and chemically
reactive flows has allowed the focusing of sufficient talent and energy for significant
progress to be made.

1.4.  Scale  Invariance  and  Riemann  Problems. 

The conservation law (1.1) is invariant under the scale transformations

$$x,\ t \to sx,\ st , \quad s > 0 \qquad (1.5)$$

in the sense that

$$V(x,t) = U(sx, st) \qquad (1.6)$$

is a solution of (1.1) if and only if $U$ is a solution. The restriction to positive $s > 0$ is
required to preserve an entropy condition, imposed in addition to (1.1), for weak
solutions.

It is natural to look at scale invariant data for (1.1) and the corresponding scale
invariant solutions. These are called Riemann problems and Riemann solutions
respectively. The Riemann solution defines the large time asymptotics of a general
solution. In this sense and using the language of quantum mechanics, the Riemann
solution is the outgoing (W*) wave operator of a scattering problem [12]. In fact taking
$s \to \infty$ in (1.6) gives scale invariant data


and formally the solution $V(x,t)$ is the infinite scaling limit of $U(x,t)$ and thus defines
its large time asymptotic behavior. The mathematical proof of these statements as
well as the analysis of further terms in the large time behavior has been given by T.-P.
Liu [23,24,26].

The limit $s \to 0$ in (1.6) also defines a Riemann problem and its solution. This
limit is the instantaneous response to jump discontinuities at the origin in the data
$U_0(x) = U(x, t = 0)$. This second interpretation of the Riemann problem allows the
following picture of a general solution. It will consist of a number of jump
discontinuities (fronts) separated by smooth regions and possibly smaller jumps which cross
and interact with one another at isolated points. At an isolated interaction point, the
solution behavior is governed by a Riemann solution. In this sense the study of
Riemann problems is equivalent to the study of the interaction of nonlinear localized
waves.

We define an elementary wave to be a scale invariant (Riemann) solution of
(1.1) which also moves as a traveling wave:

$$U(x,t) = U(x - ct)$$

for some $c$. In one space dimension, the elementary waves are the localized waves,
i.e. shocks, contact discontinuities, etc., while in two space dimensions they are the
intersection points of jump discontinuity surfaces, i.e. Mach triple points, etc.

The elementary waves are the basic building blocks for the solution of Riemann
problems; this is true in higher space dimensions as well as in one dimension. Thus
elementary waves are of fundamental importance in understanding the solutions of
(1.1).

Because  of  the  reduction  in  the  number  of  independent  variables,  elementary 
waves  in  d  dimensions  are  more  or  less  comparable  in  difficulty  to  general  solutions 
in  d  -  2  dimensions  and  Riemann  solutions  correspond  in  approximate  difficulty  to 
general  solutions  in  d  -  1  dimensions. 

In particular the $d = 1$ Riemann problem has a lot in common with the theory of
ordinary differential equations in the large. Methods such as global analysis,
bifurcation theory and geometry are useful for both classes of problems.

See [11,12] for a more extensive discussion of the ideas presented in this subsection.


2.  Nonlinear  Resonance 


2.1.  Introduction. 

The umbilic points $U_0$ with $\lambda_i(U_0) = \lambda_{i+1}(U_0)$ allow a degree of interaction or
nonlinear resonance between distinct modes $e_i(U_0)$ and $e_{i+1}(U_0)$ which is missing in
the strictly hyperbolic case. This fact produces a novel and rich range of
mathematical phenomena.

Nonlinear theories can be divided into those which are qualitatively linear and
those which are essentially or globally nonlinear. The central feature of the
phenomena associated with nonlinear resonance and umbilic points is a striking
departure from the linear guideposts which have dominated our previous
understanding of wave interactions.

Let

$$S \subset R^n$$

be the state space in which the solution $U \in S$ takes its values. The waves introduce a
type of coordinate geometry in $S$. An umbilic point is a singularity in this geometry.
To be precise, each eigenvector $e_i(U)$ defines a vector field on $S$. The integral curves,
i.e. the solutions of the state space differential equation

$$\frac{dU}{d\xi} = e_i(U) , \qquad (2.1)$$

represent both coordinate lines in $S$ and, in their $x, t$ realization, rarefaction waves
which contribute to the solution of the one dimensional Riemann problem. The $x, t$
realization comes from setting $\xi = x/t = \lambda_i(U)$ and observing that a solution
$U = U(\xi) = U(x/t)$ of (2.1) also solves (1.1). These rarefaction waves are called
centered rarefaction waves because they are constant on the characteristic lines $\xi = x/t$
through the origin.


Shock waves are defined by interpreting (1.1) in a weak sense, for example as
distributions or measures. Then jump relations are implied between the conserved
quantities $U_i$ and the fluxes $F_i$. We let $[a] = a_+ - a_-$ if $a$ is a quantity with a jump
discontinuity across a curve in space time and $a_\pm$ represents the values of $a$ on the
right (left) of the curve. Then (1.1) is equivalent to

$$s[U] - [F] = 0 , \qquad (2.2)$$

where $s$ is the speed of the curve, as far as jump discontinuities are concerned. In
particular, two constant states separated by a jump which satisfies (2.2) define a
solution of (1.1).
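
A scalar example may help fix ideas. The Python sketch below solves the Riemann problem for a single conservation law with the convex flux $F(u) = u^2/2$ (Burgers' equation), which is an assumption chosen for simplicity and is not one of the systems discussed in the text. The shock speed comes from the jump relation $s[U] - [F] = 0$, the rarefaction is the centered wave $u = x/t$, and the solution depends on $x/t$ only, so it is invariant under the scaling $x, t \to sx, st$. Function names are illustrative.

```python
def burgers_flux(u):
    """Flux F(u) = u^2 / 2 of the model scalar conservation law."""
    return 0.5 * u * u


def shock_speed(u_left, u_right):
    """Speed s = [F] / [U] obtained from the jump relation s[U] - [F] = 0."""
    return (burgers_flux(u_right) - burgers_flux(u_left)) / (u_right - u_left)


def burgers_riemann(u_left, u_right, x, t):
    """Riemann solution at (x, t) with t > 0; it depends on xi = x / t only."""
    xi = x / t
    if u_left > u_right:
        # Admissible shock: a single jump moving with the Rankine-Hugoniot speed.
        return u_left if xi < shock_speed(u_left, u_right) else u_right
    # Centered rarefaction: constant states joined by the fan u = x / t.
    if xi <= u_left:
        return u_left
    if xi >= u_right:
        return u_right
    return xi


if __name__ == "__main__":
    print(shock_speed(2.0, 0.0))
    print(burgers_riemann(2.0, 0.0, 0.5, 1.0))
    print(burgers_riemann(0.0, 2.0, 0.5, 1.0))
```

For this convex flux a jump is admissible only when the left state exceeds the right state, which is why the other ordering is resolved by a rarefaction.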

The  solutions  to  (2.2)  lie  in  families  and  define  a  geometry  on  S,  in  a  manner 
similar  to  the  rarefaction  waves.  The  resulting  curves  in  S  are  called  shock  curves  or 
Hugoniot  curves.  They  also  become  singular  at  an  umbilic  point. 

2.2.  The  Standard  Theory. 

The standard theory for a single scalar equation is due to Oleinik [28,29]. It is
a theory in the large, but because of the restriction to a single mode, the geometric
singularities caused by resonances between distinct modes do not occur. In the
Oleinik theory, the solution to the Riemann problem is a composite wave formed
from a rarefaction wave with embedded jumps (shocks) within it. Considered
geometrically from the point of view of state space, it is formed by taking the upper
or lower convex envelope of the graph of the flux function. The chords in the envelope
correspond to jumps (shocks) in the x, t space solution while the rest of the envelope,
which lies on the graph of the flux function itself, corresponds to rarefaction waves in
the x, t space picture.
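
The envelope construction can be sketched numerically. The code below samples the flux, builds the lower convex envelope (for u_L < u_R) or the upper envelope (for u_L > u_R, by negating the flux), and labels chords as shocks and arcs on the graph as rarefactions. The choice of envelope by the ordering of the states is the usual convention and is assumed here; the Buckley-Leverett-type flux, the grid size and all names are illustrative assumptions.

```python
import numpy as np


def envelope_waves(flux, u_left, u_right, n=2001):
    """List of (kind, u_a, u_b, speed) in order of increasing u.

    Chords of the envelope are reported as shocks (with their speed),
    arcs lying on the flux graph as rarefactions (speed is None).
    """
    lo, hi = min(u_left, u_right), max(u_left, u_right)
    u = np.linspace(lo, hi, n)
    f = flux(u)
    # Lower hull of f when u_left < u_right; lower hull of -f (upper hull of f) otherwise.
    g = f if u_left < u_right else -f
    hull = []  # grid indices of the lower convex hull vertices (monotone chain)
    for i in range(n):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (u[b] - u[a]) * (g[i] - g[a]) - (g[b] - g[a]) * (u[i] - u[a])
            if cross <= 0.0:
                hull.pop()  # vertex b lies on or above the chord a-i
            else:
                break
        hull.append(i)
    waves = []
    for a, b in zip(hull[:-1], hull[1:]):
        if b - a > 1:  # the hull skips grid points: a chord, hence a shock
            waves.append(("shock", u[a], u[b], (f[b] - f[a]) / (u[b] - u[a])))
        elif waves and waves[-1][0] == "rarefaction":
            waves[-1] = ("rarefaction", waves[-1][1], u[b], None)  # extend the arc
        else:
            waves.append(("rarefaction", u[a], u[b], None))
    return waves


def buckley_leverett(u):
    """Illustrative S-shaped flux with unit mobility ratio (an assumption)."""
    return u**2 / (u**2 + (1.0 - u) ** 2)


for wave in envelope_waves(buckley_leverett, 1.0, 0.0):
    print(wave)
```

For this flux with u_L = 1 and u_R = 0 the result is a rarefaction joined to a shock at the tangency state u = 1/sqrt(2), which is a composite wave of the kind described above.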

The standard theory for systems is due to Lax [19,20]. It solves the Riemann
problem in the small ($U_L \approx U_R$), excluding umbilic points and assuming a convexity
condition within each mode

$\langle \nabla \lambda_i(U), e_i(U) \rangle \neq 0$ ,  (2.3)


also known as genuine nonlinearity. Under these hypotheses, there are no singularities
in the geometry defined by the rarefaction and shock curves. Starting at a given
state $U_R$, there is a unique half of the shock curve which is stable under forward time
evolution and a unique half of the rarefaction curve which is realizable in x, t space
(because wave speeds must decrease when moving to the left from $U_R$ to $U_L$ in x, t
space). These two half curves join smoothly and define the wave curve through $U_R$.
There is one such curve for each mode. The solution of the Riemann problem is
accomplished by moving a required distance and direction on the n, n-1, ..., 1 wave
curves, along a unique path which starts at $U_R$ and ends at $U_L$. Each segment of this
path lies along a wave curve and corresponds to a shock or rarefaction wave in x, t
space. In each of the sectors between these waves the Riemann solution is constant.

The Lax theory [19] also allows linearly degenerate families, for which

$\langle \nabla \lambda_i(U), e_i(U) \rangle = 0$ .  (2.4)

For these families, shock and rarefaction waves coincide.
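
The quantity appearing in the genuine nonlinearity and linear degeneracy conditions can be checked numerically. The sketch below evaluates the directional derivative of an eigenvalue along its own eigenvector by central differences. The shallow water system and the constant-coefficient system are illustrative assumptions chosen only because their answers are known in closed form; they are not taken from the text.

```python
import numpy as np

G = 9.81  # gravitational acceleration, illustrative value


def shallow_water_jacobian(state):
    """Jacobian of F = (m, m^2/h + G h^2/2) in conserved variables (h, m)."""
    h, m = state
    u = m / h
    return np.array([[0.0, 1.0], [G * h - u * u, 2.0 * u]])


def constant_jacobian(state):
    """Linear wave system with speeds -2 and +2; every mode is linearly degenerate."""
    return np.array([[0.0, 1.0], [4.0, 0.0]])


def nonlinearity(jacobian, state, mode, eps=1e-6):
    """<grad lambda_mode, e_mode> at `state`, eigenvalues sorted in increasing order.

    The eigenvector is scaled so that its first component is 1, which
    assumes that component is nonzero (true for both examples here).
    """
    state = np.asarray(state, dtype=float)

    def lam(point):
        return np.sort(np.linalg.eigvals(jacobian(point)).real)[mode]

    grad = np.array([(lam(state + eps * d) - lam(state - eps * d)) / (2.0 * eps)
                     for d in np.eye(len(state))])
    values, vectors = np.linalg.eig(jacobian(state))
    e = vectors[:, np.argsort(values.real)[mode]].real
    e = e / e[0]
    return float(grad @ e)


print(nonlinearity(shallow_water_jacobian, (2.0, 1.0), mode=1))  # nonzero
print(nonlinearity(constant_jacobian, (2.0, 1.0), mode=1))       # zero
```

For the shallow water system the exact value with this scaling is plus or minus 3c/(2h), with c = sqrt(G h), so both modes are genuinely nonlinear wherever h > 0.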

It  is  clear  from  these  two  standard  theories  that  a  global  theory  of  the  Riemann 
problem  would  be  built  from  wave  curves  which  are  in  general  Oleinik  composite 
waves. There was a general (and incorrect) impression that little of interest would
occur beyond this.

2.3.  The  Isolated  Umbilic  Point. 

The theory of an isolated umbilic point is due to Eli Isaacson, D. Marchesin, D.
Schaeffer, M. Shearer, P. Paes-Leme and B. Plohr, with recent contributions by H.
Holden and C. F. Palmeira. Near an isolated umbilic point $U_0$ with
$\lambda_i(U_0) = \lambda_{i+1}(U_0)$ one can scale and blow up the singularity. This is equivalent to
replacing the flux function F(U) by its lowest order nontrivial terms. We assume
n = 2 and, by a Galilean transformation, $\lambda_1 = \lambda_2 = 0$ at $U_0$, so the blow up yields
generically an F(U) which is a homogeneous quadratic polynomial. There are some
inessential scaling parameters in the homogeneous quadratic F and the selection of a
unique F from each equivalence class is the problem of normal forms. It was solved
by Isaacson, Plohr and Temple and in a subsequent and more satisfactory form by
Schaeffer and Shearer [31]. The classification of the geometry of rarefaction curves
and a number of preliminary tools for the analysis of quadratic flux Riemann problems
are also presented in [31]. The Riemann problems and normal forms divide into
four cases (I, II, III, IV, roughly in order of decreasing difficulty) and in each case
there is a symmetric subcase in which one parameter of the normal form is fixed at
zero and the resulting Riemann solution is simplified by an extra $Z_2$ symmetry.
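
To make the notion of an isolated umbilic point concrete, the following sketch takes a homogeneous quadratic flux in two variables and measures the gap between the two wave speeds. The flux is written as the gradient of a cubic potential; this particular potential and the coefficient values are illustrative assumptions and should not be read as the normal form of [31].

```python
import numpy as np

A, B = 2.0, 0.5  # illustrative coefficients (assumption); need A != 1 + B**2


def quadratic_flux_jacobian(u, v, a=A, b=B):
    """Jacobian of F = grad C with C = a u^3/3 + b u^2 v + u v^2.

    F is a homogeneous quadratic, so the Jacobian is linear in (u, v)
    and vanishes at the origin.
    """
    return np.array([[2 * a * u + 2 * b * v, 2 * b * u + 2 * v],
                     [2 * b * u + 2 * v, 2 * u]])


def eigenvalue_gap(u, v):
    """Difference of the two (real) wave speeds; zero exactly at an umbilic point."""
    lam = np.linalg.eigvalsh(quadratic_flux_jacobian(u, v))
    return lam[1] - lam[0]


angles = np.linspace(0.0, 2.0 * np.pi, 721)
gap_on_circle = min(eigenvalue_gap(np.cos(t), np.sin(t)) for t in angles)
print(eigenvalue_gap(0.0, 0.0), gap_on_circle)
```

Because the Jacobian is homogeneous of degree one, a positive gap on the unit circle implies a positive gap everywhere except at the origin, so the origin is the only umbilic point of this example.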


The first Riemann problem of this class was solved by Shearer, Schaeffer,
Marchesin and Paes-Leme [34]. It was the symmetric case I. In rapid succession, other
cases were solved: the symmetric cases II, III and IV by Eli Isaacson, Marchesin,
Plohr and Temple [14] and by Eli Isaacson and Temple [15], the nonsymmetric (general)
case II by Shearer and Schaeffer and cases III and IV by Eli Isaacson and
Marchesin [32].


Only  the  type  I  nonsymmetric  case  remains  open.  The  essential  ingredients 
which  allowed  the  rapid  progress  were  analytic  ideas  from  bifurcation  theory  and 
experiments  from  the  numerically  based  Riemann  solver.  It  seems  clear  that  both  the 
analytic  and  the  numerical  tools  developed  will  be  of  considerable  importance  for  the 
analysis  of  other  Riemann  problems. 


2.4.  New  Mathematical  Phenomena. 


Shock waves can be clearly recognized as belonging to the i-th family if i-family
characteristics enter from both sides while the j-family characteristics, $j \neq i$, each
cross the shock, entering from one side and leaving from the other. Such shocks are
called stable in the sense of Lax or Lax shocks, for short.


It has been found that non-Lax shocks are required to solve the Riemann problem
near an umbilic point. The new shocks have the structure of an $i_L$ Lax shock
when viewed from the left side and an $i_R \neq i_L$ Lax shock when viewed from the right
side. If $i_R > i_L$, the shock is overcompressive and fewer than n waves occur in the
Riemann solution. If $i_R < i_L$ the shock is undercompressive and more than n waves
occur in the solution of the Riemann problem. The undercompressive shock is also
called a crossing shock because it is a bridge, joining two sheets of the Hugoniot
surface. Both under and overcompressive shocks arise.
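
The distinction between Lax, overcompressive and undercompressive shocks amounts to counting characteristics that run into the shock. A minimal sketch, assuming that all wave speeds differ from the shock speed and using hypothetical names, is:

```python
def classify_shock(speeds_left, speeds_right, s):
    """Classify a shock of speed s from the characteristic speeds on each side.

    A characteristic enters the shock from the left if its speed exceeds s
    and from the right if its speed is below s. With n modes, a Lax shock
    has exactly n + 1 entering characteristics; more is overcompressive,
    fewer is undercompressive. Speeds equal to s are not handled.
    """
    n = len(speeds_left)
    entering = sum(1 for lam in speeds_left if lam > s)
    entering += sum(1 for lam in speeds_right if lam < s)
    if entering == n + 1:
        return "Lax"
    return "overcompressive" if entering > n + 1 else "undercompressive"


# Illustrative speed pairs for n = 2 (made-up numbers, not from the text).
print(classify_shock((-1.0, 2.0), (-2.0, 0.5), s=1.0))
print(classify_shock((1.5, 2.0), (-1.0, 0.0), s=1.0))
print(classify_shock((-1.0, 2.0), (0.0, 3.0), s=1.0))
```

In terms of the family indices seen from each side, the number of entering characteristics is n + 1 plus the right index minus the left index, which is why equal indices give a Lax shock.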


Both the Hugoniot and the wave curves can have disconnected branches. This
means that there are waves which cannot be continuously deformed to zero strength.
This fact appears to have been discovered independently by Shearer and Marchesin
before they were aware of one another's work. The Hugoniot curves can have loops,
or points of self intersection, as was first discovered by Shearer. The singularity of
the geometry of the wave curves at an umbilic point has already been noted.
Moreover the wave curves may fail to have a continuation.


One can regard the wave curve as an n+1 dimensional surface in $S \times S$. In the
standard theory, these surfaces are globally distinct. In the presence of umbilic
points and especially of undercompressive (also called crossing) shocks and waves,
we regard the i-th wave family as a single sheet of a global wave surface, as in the case
of Riemann surfaces. Locally this surface has n distinct branches, but the branches
may join globally and may be deformed onto one another. Thus the distinction
between an i wave and a j wave may be well defined locally at $U \approx U_0$, but this
distinction is not globally meaningful in all cases.


Various topologically significant surfaces in S and in $S \times S$ have been determined
which help to delineate the locations of possible bifurcations of structure in the
Riemann solution. These are the inflection locus, on which (2.3) fails and the convexity
of a single mode is reversed; the bifurcation locus, on which secondary bifurcations
of the Hugoniot curves occur, or in other words at which the Hugoniot curves cross;
the 2-sided contact locus, across which embedded shocks in the rarefaction fans enter
or leave the solution; and the hysteresis locus, across which Hugoniot curves acquire or
lose segments of distinct families or types.


2.5.  Elliptic  Regions. 


A small perturbation of the flux F in a neighborhood of an umbilic point can
give rise to an elliptic region $E \subset S$ for which $\lambda_i$ and $\lambda_{i+1}$ are complex. There have
been three major discoveries in connection with elliptic regions. Holden [13] showed
that a Riemann problem with an elliptic region still has a satisfactory mathematical
solution. Bell, Trangenstein and Shubin [5] solved such a Riemann problem numerically
and also obtained satisfactory results. Finally Shearer [35] showed that elliptic
regions are almost certainly required on topological grounds for basic problems in
petroleum reservoir modeling.

How is this to be reconciled with the idea that initial value problems must be
hyperbolic? Evidently the linear instability guaranteed by the elliptic region E is
only an infinitesimal instability and the problem is stabilized by nonlinear
considerations. When the elliptic region E is bounded, one could expect the linear or
infinitesimal instability to cause a solution taking values in E to grow, until it was forced to
exit from E, at which point it would lie in the hyperbolic region $S \setminus E$, and be
stabilized. In fact this is exactly what does occur according to available evidence; the
solution, if forced to lie in E, will exit with a shock and not return unless forced to do so.
More precisely, it appears that the wave path taken by the Riemann solution will not
enter E unless $U_R \in E$ or $U_L \in E$. However the meaning of this elliptic region should
be explored more carefully before this or any other explanation is accepted. There
are cases, as with the van der Waals equation of state for a compressible fluid with a
phase transition, where an elliptic region results from an incorrect physical model.
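
The van der Waals case can be illustrated with a small sketch. It assumes the isothermal p-system v_t - u_x = 0, u_t + p(v)_x = 0 in Lagrangian coordinates, whose characteristic speeds are plus or minus sqrt(-p'(v)), together with the reduced van der Waals pressure law. Both modeling choices and all names are assumptions made for illustration; they are not taken from the text.

```python
def vdw_pressure_slope(v, temperature):
    """dp/dv for the reduced van der Waals law p = 8T/(3v - 1) - 3/v^2, v > 1/3."""
    return -24.0 * temperature / (3.0 * v - 1.0) ** 2 + 6.0 / v**3


def p_system_type(v, temperature):
    """Type of the p-system at specific volume v.

    The wave speeds are +-sqrt(-p'(v)): real and distinct when p' < 0
    (hyperbolic), complex when p' > 0 (elliptic).
    """
    slope = vdw_pressure_slope(v, temperature)
    if slope < 0.0:
        return "hyperbolic"
    return "elliptic" if slope > 0.0 else "degenerate"


# Below the critical temperature (T = 1 in reduced units) an elliptic band appears.
for v in (0.6, 1.0, 2.0):
    print(v, p_system_type(v, temperature=0.9))
print(1.0, p_system_type(1.0, temperature=1.1))
```

In this model the elliptic band is the interval of specific volumes between the two points where p'(v) = 0 on a subcritical isotherm, and it disappears above the critical temperature.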

2.6.  Open  Problems. 

Most problems related to uniqueness are open: entropy, admissibility conditions,
and the existence of viscous profiles are not satisfactorily understood. The
proper formulation of a physically meaningful entropy is open. For the case of a single
mode, n = 1, in the Buckley-Leverett equation, a physically meaningful entropy
has been proposed [4], and this could be the basis of a physically meaningful entropy
for the case of systems. Existence for the Riemann problem in the nonsymmetric case
I is open and a solution for the full range of possible elliptic cases is open. It is
likely that new phenomena will occur with n x n systems, $n \geq 3$, but this case has yet
to be explored. The effect of these equations on existence theory for general data is
not known. The case of an umbilic line has been studied by Keyfitz and Kranzer
[18] and Eli Isaacson [17]. In this case the existence theory for general data was
solved (for small data) by Temple [38] and it turned out that some new ideas were
needed. In fact the total variation bounds, which are central to the existence theory,
had to be reformulated in this case as they failed when applied to the conserved
quantities U. The ability of various finite difference algorithms to solve Riemann
problems with umbilic points or lines is not known.

Finally  we  ask  whether  these  novel  structures  in  Riemann  solutions  have  any 
counterpart  in  experimental  science. 

3.  Riemann  Problems  for  Realistic  Equations 

3.1.  Introduction. 

Real  problems  are  often  not  strictly  hyperbolic.  They  may  fail  to  be  genuinely 
nonlinear  and  usually  must  be  considered  in  the  large.  Thus  we  can  expect  to 
encounter  the  phenomena  described  in  the  previous  section.  Here  we  explain  why 
some  real  problems  possess  special  features  which  limit  the  solution  complexity  and 
others do not. Let us exclude the linearly degenerate waves, which in many problems
do not give rise to complex one dimensional wave interactions. For gas dynamics,
elasticity and a number of other cases, the state space S is a product

S  =  C  x  h  (3.1) 

of a configuration space and a momentum space. The eigenvalues, when expressed
in a Lagrangian rest frame, come in pairs $\pm\lambda_i(U)$ and in the interior of C are never
zero. Thus umbilic points arise only from coincidence of eigenvalues among the
positive or the negative eigenvalue families. It follows that these Riemann problems
have a complexity similar to those of a Riemann problem for general systems of n/2
equations.


For systems which describe species or concentrations, there is generally no
factorization of S and no grouping of eigenvalues into disjoint families. Such systems,
which include oil reservoir flow equations as an example, appear to be as complex as
the order, n, of the system allows.

3.2. Gas Dynamics.

The main complications of the wave structure in the Euler equations of compressible
fluids are those of n = 1, scalar, equations, according to the n/2 rule and the
factorization (3.1) of the state space. Each acoustic mode may develop, through self
interactions, composite wave structures containing rarefaction waves and some
number of embedded shocks. The details of the allowed composite waves depend on
the equation of state, and a comprehensive analysis of this dependence has been
prepared by Menikoff and Plohr [27]. This means that qualitative as well as quantitative
properties of the solution depend on the fluid in which the waves are propagating.
Not only are the wave properties of real fluids of interest, but those of artificial or
simulated fluids defined by approximate equations of state are important also. In
fact, the approximate equations of state are used in numerical simulations of fluids
and any anomalous waves implied by the use of such an equation of state will occur
in the numerical simulation and so must still be understood.

There  appears  to  be  no  limit  to  the  allowed  number  of  constituent  waves 
(embedded  shocks)  in  a  single  composite  wave,  on  the  basis  of  thermodynamical 
principles.  For  many  materials,  the  change  in  convexity  occurs  at  or  near  a  phase 
transition,  and  a  simple  assumption  would  be  that  at  most  two  convexity  reversals 
(one  convexity  reversed  region)  would  be  encountered.  In  other  words,  a  given 
Hugoniot  curve  would  cross  the  inflection  locus  at  most  twice.  However  fluid  (P 
wave)  modes  arise  in  solids  also  and  a  large  number  of  distinct  phase  transitions 
occur in real solids so this simple picture would not be universal.

Thus the simplest manifestation of real fluid properties is the splitting of a shock
at a phase transition, with a precursor moving ahead in the fluid phase and a second
shock following in the vapor phase. Shock splitting of this type is well known in the
engineering and applied physics literature.

The Riemann problem for a relativistic fluid with a real equation of state (and
a phase transition) was solved by Plohr and Sharp [30]. The phase transition which
motivated this study was the condensation of a quark-gluon plasma into a baryon
phase in proposed experiments for a new particle accelerator.

In addition to the equilibrium thermodynamical effects considered here, metastable
states and nonequilibrium thermodynamics are of interest also. T.-P. Liu [25]
showed how a nonequilibrium partition of internal energy (vibrations of a diatomic
molecule) leads to modified fluid equations. It would be desirable to supplement the
van der Waals [36] type of metastable thermodynamics by a more realistic and
modern treatment of metastable thermodynamics. Bethe [6] points out that the
equilibrium state behind a strong shock in water may be ice, in which case the
metastable water would be the preferred solution.

A comprehensive data base of tabulated equations of state has been prepared by
the Los Alamos National Laboratory for a wide range of materials [1]. Using this
data base, J. Scheuermann has constructed an efficient Riemann solver for real fluids
[33]. Colella and Glaz [8] had previously constructed an efficient approximate
Riemann solver for real fluids, using a local gamma law gas approximation to reduce
the number of calls to the real fluid equation of state. Scheuermann's work contains,
in addition, an extensive use of precomputed quantities. In his approach the
rarefaction curves are given by a single table look up and are thus faster than the Hugoniot
curves to determine.
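
The idea of obtaining a rarefaction curve from a single table look up can be sketched as follows. The integral of c/rho along an isentrope is precomputed once on a density grid, and a point on the rarefaction curve is then found by interpolation. This is not a description of the solver in [33]; the gamma law isentrope (chosen because the integral is known exactly), the grid and all names are assumptions made for illustration.

```python
import numpy as np

GAMMA, K = 1.4, 1.0  # isentrope p = K * rho**GAMMA, illustrative values


def sound_speed(rho):
    """Sound speed c = sqrt(dp/drho) along the isentrope."""
    return np.sqrt(GAMMA * K * rho ** (GAMMA - 1.0))


def build_table(rho_min, rho_max, n=2000):
    """Tabulate ell(rho) = integral of c/rho from rho_min, by the trapezoid rule."""
    rho = np.linspace(rho_min, rho_max, n)
    integrand = sound_speed(rho) / rho
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(rho)
    return rho, np.concatenate(([0.0], np.cumsum(steps)))


RHO_TABLE, ELL_TABLE = build_table(0.1, 10.0)


def rarefaction_velocity(rho, rho0, u0, family=+1):
    """Velocity on the rarefaction curve through (rho0, u0), by table look up.

    family=+1 is the u + c mode (u - ell constant),
    family=-1 is the u - c mode (u + ell constant).
    """
    ell = np.interp(rho, RHO_TABLE, ELL_TABLE) - np.interp(rho0, RHO_TABLE, ELL_TABLE)
    return u0 + family * ell


exact = -2.0 * (sound_speed(0.5) - sound_speed(1.0)) / (GAMMA - 1.0)
print(rarefaction_velocity(0.5, 1.0, 0.0, family=-1), exact)
```

For a tabulated equation of state the sound speed itself would come from the table, and the cost per query would still be one interpolation, whereas a point on the Hugoniot curve requires solving the jump relations.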

Phase transitions introduce discontinuities into an equation of state just as shock
waves do for solutions of fluid equations. In both situations interpolations and
numerical differentiations are required and in both cases these operations cause
problems when applied across discontinuities. The front tracking software (see [7,10]
and related papers) designed for x, t space discontinuities works just as well for state
space discontinuities. It provides support for interpolation over irregular regions
which result from changes of independent variables, starting with a rectangle in (say)
p, e space [33]. It is also appropriate for an accurate representation of phase
transition curves in S.

The existence of solutions for the Riemann problem for a compressible fluid
with a real equation of state follows from basic physical principles, which give an
asymptotic description of the equation of state at large pressures and ensure a
solution to the midstate shooting problem. Uniqueness, however, is not properly
understood. There are the questions of entropy conditions for complex waves, relaxation
limits, viscous profiles or other and distinct physical principles which may be
required.


3.3.  Real  Materials. 

We consider here thermo-elastic-plastic materials [41] or viscoelastic materials
with a simple relaxation law. Many common materials, including metals, are
described by this theory, but it does not have the thermodynamic universality of the
real fluids described in the previous section. In order to focus specifically on the
question of complex wave structure, we ignore the thermal mode.

Isothermal elasticity has a six dimensional state space with the structure (3.1).
There are three coordinates to describe positions and three to describe momenta.
There are three types of waves. The first is a pressure, P-wave or longitudinal wave.
The other two are S-wave, shear or transverse waves. One of the S-wave modes
describes torque or rotation waves. For an isotropic material (which we now assume),
there is no elastic energy associated with these rotational waves. For this reason the
rotational waves are linearly degenerate and factor out of the problem. This leaves
four modes and, applying the n/2 rule, elasticity has the level of wave structure
complication of a general 2 x 2 system.

A remarkable analysis of the Goursat (half-space) Riemann problem for a third
order hyperelastic material has been carried out by Tang and Ting [37]. Related
studies of nonlinear elastic waves can be traced from the literature cited in [37]. The
solution is similar in structure to the solution for an isolated umbilic point described
in Section 2.3 above. In particular there is an isolated umbilic point located on the
zero shear axis in the Tang and Ting solutions. Real materials fail in tension and it
would be of considerable interest to determine whether the Tang and Ting umbilic
point occurs within the elastic limit and if so for what range of materials and strains.

For metals, a regime of interest is one for which the response is nearly linear in
shear and tension, but becomes fluid like (fully nonlinear) in compression. Thus
aside from P-wave or fluid nonlinearities already discussed, the most striking
nonlinearities of common elastic materials occur at the failure of the elastic theory.

There are at least three common failure modes for an elastic material. These
are plastic flow, fracture, and collapse. They apply to ductile, brittle and porous
materials respectively. The first two are shear wave failures while the third is a
P-wave failure.


The theories of elastic failure are complex, phenomenological and incomplete.
We discuss the case of plastic failure, which may be the best understood of the three.
At a microscopic level, plastic flow results from a breaking and reforming of molecular
bonds. This process produces several effects. Stored elastic (potential) energy is
converted into thermal energy, with the result that elastic forces (stress) are reduced
and the unstrained reference configuration is permanently altered. Also dislocations
are produced, which alter the material properties through work hardening. Moreover
the thermal energy produces heating and heat softening of the material.

Making the Prandtl-Reuss approximation, we assume plastic relaxation occurs
along the normal to the yield surface in stress space, and we represent the degree of
plastic flow by a single scalar variable $\psi$. The resulting equations are given in [41],
and their main feature is a new equation and mode for $\psi$, the amount of plastic
deformation.  The  purely  elastic  equations  together  with  a  nonlinear  coupling  to 
describe  plastic  relaxation  and  the  transfer  of  energy  from  elastic  to  thermal  modes 
complete  the  system.  Plasticity  should  be  contrasted  to  viscosity,  which  transfers 
kinetic energy to thermal modes.

3.4. Oil Reservoirs.

The equations for the saturation of oil, gas and water in three phase flow in an
oil reservoir (porous medium) are to leading order a 2 x 2 hyperbolic system. The
equations for most enhanced oil recovery processes have the same form but usually
have more equations. These equations were the motivation which led to the study
of umbilic points as discussed in Section 2.3. In particular the solutions of these
equations exhibit the complex wave phenomena mentioned in Sections 2.4 and 2.5.

There is no reason to believe that the current catalog of mathematical
phenomena in the solution of the Riemann problem is complete, especially as
additional processes and more equations are considered.

4.  Two  Dimensional  Wave  Interactions 

4.1.  Elementary  Waves. 

The elementary waves in two dimensions can be studied by the same global
methods which were used for Riemann problems in one dimension. An elementary
wave is a scale invariant solution of the conservation law which is stable in time. It
consists of angular sectors about a fixed origin. In each sector the solution takes on
constant values or is a simple wave. Just as the one dimensional Riemann problem is
a shooting problem to connect $U_R$ to $U_L$ through a sequence of elementary waves, the
two dimensional elementary wave is a circular problem, to connect some state from
one of the sectors to itself through a sequence of one dimensional waves.

Theorem [10]. Generically, the elementary waves for a gamma law gas are one
of the following simple types: cross, overtake, Mach triple point, diffraction and
transmission.

The proof of the theorem is based on the following considerations. In the
steady frame of the elementary wave, one draws a circle about the origin. Because
of the scale invariance, all of the analysis can be reduced to this circle. Points of the
circle and the one dimensional waves contributing to this two dimensional elementary
wave are now labeled as incoming or outgoing. The incoming waves are in principle
unrestricted by the equation (1.1), but too many incoming waves are 'coincidental'
and nongeneric. The outgoing waves are subject to an analysis similar to a one
dimensional Riemann problem, and only a limited number of such outgoing waves
can occur.

The considerations of uniqueness, real fluid behavior, real material behavior
etc. as discussed in Section 3 are important here also and are mainly not resolved.

4.2.  Two  Dimensional  Riemann  Problems. 

The two dimensional Riemann solution could fail to be piecewise smooth if
there are too many solution modes ($n \geq 3$) or too many inflection points in a single
solution mode [21]. However there should be only a finite number of waves of size
greater than any fixed $\epsilon > 0$. To focus on these waves, we suppose for simplicity that
the Riemann solution is piecewise smooth. The Riemann solution is built up from
elementary waves.

We introduce reduced coordinates

$\xi = x/t$ ,  $\eta = y/t$

and in the $\xi$, $\eta$ plane we introduce polar coordinates

$\xi, \eta = r \sin\theta, r \cos\theta$ .

At large r, the conservation law, expressed in terms of r and $\theta$ as independent
variables, is hyperbolic, with r as the timelike variable. The data at large r is given from
the solution of one dimensional Riemann problems. It can be continued inward to
smaller r by the solution of this hyperbolic equation until an elliptic region is
encountered. Data for the elliptic region is specified across a sonic line or shock.
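
The change of type described above can be checked pointwise. The sketch below assumes that (1.1) is the compressible Euler system in two space dimensions and that the reduced coordinates are x/t and y/t. For such self-similar flow the equations are hyperbolic where the velocity relative to the point (x/t, y/t) is supersonic and change type where it becomes subsonic. The function and argument names are hypothetical.

```python
import math


def self_similar_type(xi, eta, u, v, c):
    """Type of self-similar Euler flow at the reduced point (xi, eta) = (x/t, y/t).

    (u, v) is the fluid velocity and c the sound speed at that point.
    The relevant speed is that of the flow relative to the moving point.
    """
    pseudo_speed = math.hypot(u - xi, v - eta)
    if pseudo_speed > c:
        return "hyperbolic"
    if pseudo_speed < c:
        return "elliptic"
    return "sonic"


# Gas at rest with unit sound speed: far from the origin the relative speed is
# large, so the flow is hyperbolic; inside the unit circle it is not.
print(self_similar_type(3.0, 4.0, 0.0, 0.0, 1.0))
print(self_similar_type(0.3, 0.4, 0.0, 0.0, 1.0))
```

Since the fluid velocity is bounded while the relative velocity grows with r, every state is hyperbolic in this sense for r large enough, which is consistent with using r as the timelike variable there.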

Scalar Riemann problems have been solved mathematically in two dimensions
[18,21,22,39,40]. An interesting set of conjectures has been formulated concerning
the solution of certain Riemann problems for isentropic gas dynamics in two
dimensions [42].


Two dimensional Riemann problems arise when one dimensional waves cross or
overtake one another or when they reflect off or interact with walls or boundaries.
Generically an interaction will arise when two waves meet or a single wave meets a
boundary; it is such simple and generic problems rather than the fully general
Riemann problem which should be studied. Two problems which have been studied
extensively on the level of experiment and computation are (a) the shock-wedge
problem of reflection of a shock wave by a wedge in a shock tube and (b) the shock
diffraction problem of reflection and transmission of a shock wave by a contact
surface. Representative references for these problems are (a) [2,9] and (b) [3].

There is a series of topologically distinct patterns for the various reflected,
transmitted and incident waves, and in some cases it is not known which pattern is
correct. It may turn out that on the level of the Euler equation (1.1), the solution is
nonunique and will be uniquely specified only by the inclusion of a length scale in a
modified theory.

Similar issues apply to the interior interaction of waves. Moreover a two
dimensional Riemann problem can also be generated by the self interactions of a single
two dimensional elementary wave. In fact the angles and wave strengths in a
stable elementary wave configuration will in general deform continuously during time
evolution, and at some space time point the elementary wave configuration in question
may cease to exist, either by a loss of stability relative to a more favored
configuration or by a failure of the shock polar equations to have a real solution. At this
point the wave pattern bifurcates. A two dimensional Riemann problem is defined,
whose solution gives the bifurcation to a new pattern formed by several elementary
waves moving away from the bifurcation point. Both the bifurcation point and the
outgoing pattern of elementary waves it produces are also subject to possible
nonuniqueness, and again a length scale may be needed to resolve this nonuniqueness.

Length scales come from a variety of sources. In the case of interaction with a
boundary, there will in general be a viscous boundary layer. For the interaction of
interior waves, the viscous effects again introduce a shock thickness. This thickness
is normally very small in gases and liquids, but is more significant in metals.
Relaxation of nonequilibrium thermodynamics also produces a length scale and shock
thickness. For chemically reacting flows, the reaction zone defines a length scale,
normally considerably larger than a shock thickness. Heterogeneities in a medium or
in a background flow such as small scale turbulence also provide a length scale. Two
dimensional instabilities of a planar interface may introduce a complicated
pseudo-one dimensional traveling wave with an extended thickness.
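
The failure of the shock polar equations to have a real solution, mentioned earlier in this section as one way an elementary wave configuration ceases to exist, can be seen in the simplest setting of a single oblique shock in steady supersonic flow of a gamma law gas. The sketch below uses the standard relation between flow deflection, shock angle and Mach number and finds the largest deflection the polar allows. Restricting to one polar is a simplification made here; the elementary waves in the text involve several polars at once, and the names are hypothetical.

```python
import math


def deflection_angle(mach, beta, gamma=1.4):
    """Flow deflection (radians) behind an oblique shock of angle beta (radians)."""
    numerator = 2.0 / math.tan(beta) * (mach**2 * math.sin(beta) ** 2 - 1.0)
    denominator = mach**2 * (gamma + math.cos(2.0 * beta)) + 2.0
    return math.atan(numerator / denominator)


def max_deflection(mach, gamma=1.4, n=20000):
    """Largest deflection on the shock polar, by scanning shock angles
    between the Mach angle and a normal shock."""
    mach_angle = math.asin(1.0 / mach)
    betas = (mach_angle + (0.5 * math.pi - mach_angle) * k / n for k in range(1, n))
    return max(deflection_angle(mach, b, gamma) for b in betas)


def has_real_solution(mach, turning_angle, gamma=1.4):
    """True if some shock angle produces the requested turning angle."""
    return turning_angle <= max_deflection(mach, gamma)


print(math.degrees(max_deflection(2.0)))
print(has_real_solution(2.0, math.radians(10.0)), has_real_solution(2.0, math.radians(30.0)))
```

As the angles in a configuration deform in time, the required turning angle can pass the maximum of the polar; beyond that point no real shock angle exists and the pattern must change.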

5.  Conclusions 

There has been a recent burst of progress in our understanding of the interactions
of nonlinear hyperbolic waves. This theory should be important for the light it
sheds on physical processes. It also gives analytic or explicit solutions which can be
used to check numerical methods. A very direct use for this theory and one of the
motivations for developing it has been to embed the theory into enhanced resolution
numerical algorithms, such as higher order Godunov methods and front tracking.

1.  INTRODUCTION

A  planar,  simple  pendulum  with  a  stationary  point  of  support 
has  but  a  single  stable  equilibrium  position,  which  is  along  the 
downward  vertical.  The  dynamic  stability  of  the  inverted  pendulum 
can be realized through the parametric loads that arise from the
inertia  forces  induced  in  the  pendulum  when  its  point  of  support 
is  made  to  move  in  an  oscillatory  manner.  The  horizontal 
component  of  support  point  motion  acts  as  an  ordinary  forcing 
effect  on  the  motion  of  the  pendulum.  The  vertical  motion  of  the 
support  point  is  more  interesting,  however,  since  it  acts  in  a 
parametric  manner  on  the  rotational  motion  of  the  pendulum. 
Thus,  if  the  motion  of  the  support  point  is  periodic  in  time,  the 
response  problem  then  becomes  mathematically  one  of  the  solution 
of  Mathieu's  differential  equation,  which  possesses  a  periodic 
coefficient.

This means of stabilizing the inverted pendulum was
investigated by Stephenson [1] to [3] shortly after the turn of
the century. In particular, he showed that if the pendulum's
point  of  support  undergoes  a  small  amplitude  but  high  frequency 
oscillatory  motion  in  the  vertical  direction,  then  the  inverted 
configuration  becomes  stable,  and  the  pendulum  performs  small 
oscillations  about  this  stabilized  upward  position.  Since 
Stephenson's  investigations  on  the  subject  of  induced  stability, 
numerous  other  investigators  [4]  to  [46]  have  studied  the 
possibility  of  stabilizing  the  inverted  pendulum  and  other 
related  mechanical  systems. 

Notable  among  the  early  publications  that  appeared  on  the 
possibility  of  induced  stability  is  a  paper  by  Lowenstern  [6]. 
His  approach  was  based  upon  the  introduction  of  a  new  set  of 
coordinates  for  the  purpose  of  eliminating  the  effect  of  the 
rapid  oscillations  on  those  coordinates  that  do  not  receive  the 
rapid  periodic  variations.  Then  he  considered  mean  values  of  the 
pertinent  coordinates  over  a  period  or  a  finite  multiple  of  the 
periods  of  the  imposed  oscillations.  In  Lowenstern’s  averaged 
equations, all the coefficients become constants. This
represents  an  early  attempt  to  apply  an  averaging  scheme  to  the 
class  of  problems  in  which  the  equations  of  motion  contain 
periodic  coefficients.  Bogdanoff  [8]  employed  Lowenstern’s 
coordinates  and  devised  a  proof  of  the  fact  that  for  certain 
imposed  motions  the  solutions  of  the  oscillatory  coefficient 
system and the constant coefficient system always remain close to
one another. In Bogdanoff's work, the parameters of the system
are subject to rapid stochastic variation in time.

In a somewhat related investigation, Hsu [13] has studied
the  stability  of  inverted  pendula  whose  supports  are  subjected  to 
an  oscillatory  motion.  In  the  case  of  systems  of  several  degrees 
of  freedom,  he  determined  conditions  under  which  the  equations  of 
motion  could  be  decoupled  into  a  set  of  independent  Mathieu 
equations  by  means  of  a  similarity  transformation.  The  method  of 
averaging  was  not  used,  and  stability  conditions  were  obtained 
from  stability  charts. 

As  mentioned  above,  the  equation  of  motion  of  the  inverted 
pendulum  whose  point  of  support  is  driven  harmonically  in  time 
contains  a  periodic  coefficient.  Consequently,  the  condition 
that  determines  the  state  of  stability  of  the  system  is 
frequently  derived  by  means  of  Floquet  theory  (see,  e.g.,  Bolotin 
[47]).  This  procedure,  which  is  typically  complicated  and 
laborious,  can  be  avoided  through  the  application  of  the  method 
of  averaging  according  to  a  technique  described  by  Volosov  [18], 
[19].  In  essence,  the  method  of  averaging  serves  to  transform 
the  original  equations  of  motion  containing  periodic  coefficients 
into  a  simpler  set  of  equations  of  motion  that  have  only  constant 
coefficients.  With  this  simplified  system  of  equations,  it 
becomes  possible  to  apply  the  well  known  stability  criteria  for 
physical  systems  subjected  to  autonomous  loading  or  support 
conditions. Meerkov [16], [17] has discussed this technique when
applied to the vibrations of mechanical systems with a discrete
number of degrees of freedom. This technique has evolved into
what is now sometimes called vibrational control theory. It
consists of an analysis of the averaged equations of motion,
where the objective is to stabilize the equilibrium configuration
of a mechanical system (such as an inverted pendulum) through
the application of the appropriate high frequency, small
amplitude motion of its point or surface of support.

The  stability  characteristics  of  a  double  pendulum  with 
elastic  hinges  that  is  subjected  to  a  constant  follower  force  of 
magnitude P were studied by Ziegler [48]. He determined that
such a non-conservatively loaded system becomes unstable by
flutter when the critical value of the load is exceeded.
Herrmann and Bungay [49] examined the stability of Ziegler's
pendulum but assumed that the direction of the applied load P was
determined by a parameter $\alpha$ called the tangency coefficient.
When $\alpha = 0$, the load is a purely conservative force, and when
$\alpha = 1$ it is a tangential or follower force. If $\alpha < 0$, the force
is termed anti-tangential, if $0 < \alpha < 1$ sub-tangential, and if
$\alpha > 1$ super-tangential. The system becomes unstable by
divergence or flutter depending upon the value of the tangency
coefficient. Later, Herrmann and Jong [50] considered the same
problem for the case in which the influence of viscous damping in
the hinges was also included. Tso and Fung [51] subsequently
investigated the parametric instability of Ziegler's pendulum
under the combined actions of a purely tangential force ($\alpha = 1$)
and sinusoidal base motion. In particular, they determined the
conditions for instability in the non-resonance, parametric
resonance, and combination resonance cases.

In  the  present  investigation,  Ziegler’s  pendulum  with 
elastic  hinges  and  subjected  to  an  externally  applied  force  of 
magnitude  P  whose  orientation  is  specified  by  the  tangency 
coefficient $\alpha$ is again considered. However, the base upon which
one  extremity  of  the  pendulum  is  elastically  restrained  is 
made  to  undergo  a  sinusoidal  oscillation  along  the  undeformed 
axis  of  the  system.  It  will  be  assumed  here  that  the  base 
motion  is  of  small  amplitude  and  high  frequency.  The  goal 
is to stabilize the system by means of vibrational control, i.e.,
high  frequency,  low  amplitude  base  motion.  The  equations  of 
motion  are  linearized  relative  to  the  undeformed  equilibrium 
configuration  of  the  system,  and  the  method  of  averaging  is 
applied  to  the  system  of  equations  containing  periodic 
coefficients  in  order  to  generate  a  simpler  and  more  convenient 
system  of  equations  with  constant  coefficients.  This  latter 
system  serves  for  the  purpose  of  easily  computing  the  critical 
flutter  or  divergence  loads  for  the  system  as  a  function  of  the 
tangency  coefficient. 
2. REVIEW OF THE METHOD OF VIBRATIONAL CONTROL


Before  the  question  of  determining  the  possibility 
of  stabilizing  Ziegler’s  pendulum  through  the  action  of  high 
frequency,  low  amplitude  base  motion  is  addressed,  it  is 
worthwhile  to  consider  the  method  of  vibrational  control  as  it 
may  be  applied  to  the  general  class  of  discrete  physical  systems 
whose  motions  are  described  by  a  system  of  second  order  linear 
differential  equations  with  periodic  coefficients.  In  this  way, 
the  requirements  that  the  system  must  satisfy  in  order  that  the 
method  of  vibrational  control  can  be  applied  will  be  exposed. 

Consider the system of differential equations

\[ \ddot{\mathbf{v}}(t) + \epsilon\,\mathbf{D}(t)\,\dot{\mathbf{v}}(t) + [\epsilon^2\mathbf{B} + \epsilon\,\mathbf{q}(t)]\,\mathbf{v}(t) = \mathbf{0}, \tag{2.1} \]

where $\epsilon$ is a small parameter and $\mathbf{D}(t)$ and $\mathbf{q}(t)$ are known periodic
$n \times n$ matrices. Moreover, it is assumed that

\[ \mathbf{D}(t) = \mathbf{D}_0 + \mathbf{D}_1(t) \]

and that $\mathbf{D}_1(t)$ and $\mathbf{q}(t)$ have zero mean values, i.e.,

\[ M\{\mathbf{D}_1(t)\} = \lim_{T\to\infty} (1/T)\int_0^T \mathbf{D}_1(t)\,dt = \mathbf{0}, \tag{2.2} \]

\[ M\{\mathbf{q}(t)\} = \lim_{T\to\infty} (1/T)\int_0^T \mathbf{q}(t)\,dt = \mathbf{0}. \tag{2.3} \]

In order to apply the method of averaging, Equation (2.1)
must first be transformed into a suitable form, namely, a
particular type of system of first order ordinary differential
equations. For this purpose, it is convenient to introduce the
following transformations:

\[ \mathbf{v}(t) = \mathbf{y}(t) + \epsilon\,\mathbf{Q}(t)\,\mathbf{y}(t), \qquad \dot{\mathbf{v}}(t) = \epsilon\,\mathbf{z}(t) + \epsilon\,\dot{\mathbf{Q}}(t)\,\mathbf{y}(t), \tag{2.4} \]

where $\mathbf{Q}(t)$ is an arbitrary $n \times n$ matrix that must assure the
periodic character of $\mathbf{v}(t)$. A condition that will serve for the
determination of $\mathbf{Q}(t)$ will be introduced later. It is also
assumed that the mean value of the matrix $\mathbf{Q}(t)$ shall vanish:

\[ M\{\mathbf{Q}(t)\} = \lim_{T\to\infty} (1/T)\int_0^T \mathbf{Q}(t)\,dt = \mathbf{0}. \tag{2.5} \]

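A differentiation of $\mathbf{v}$ in Equation (2.4) yields

\[ \dot{\mathbf{v}} = \dot{\mathbf{y}} + \epsilon(\dot{\mathbf{Q}}\mathbf{y} + \mathbf{Q}\dot{\mathbf{y}}) = \epsilon\mathbf{z} + \epsilon\dot{\mathbf{Q}}\mathbf{y}, \]

whence

\[ (\mathbf{I} + \epsilon\mathbf{Q})\,\dot{\mathbf{y}} = \epsilon\mathbf{z}, \]

or, upon rearrangement,

\[ \dot{\mathbf{y}} = \epsilon\mathbf{R}\mathbf{z}, \tag{2.6} \]

where

\[ \mathbf{R} = (\mathbf{I} + \epsilon\mathbf{Q})^{-1}, \tag{2.7} \]

with $\mathbf{I}$ denoting the identity matrix. The derivative of the $\dot{\mathbf{v}}$
expression in Equation (2.4) is

\[ \ddot{\mathbf{v}} = \epsilon\dot{\mathbf{z}} + \epsilon\ddot{\mathbf{Q}}\mathbf{y} + \epsilon\dot{\mathbf{Q}}\dot{\mathbf{y}}. \tag{2.8} \]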


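Substitution of Equations (2.4) and (2.8) into Equation (2.1)
leads to

\[ \dot{\mathbf{z}} = -(\ddot{\mathbf{Q}} + \mathbf{q})\mathbf{y} - \dot{\mathbf{Q}}\dot{\mathbf{y}} - \epsilon[\mathbf{D}\dot{\mathbf{Q}} + \mathbf{B} + \mathbf{q}\mathbf{Q} + \epsilon\mathbf{B}\mathbf{Q}]\mathbf{y} - \epsilon\mathbf{D}\mathbf{z}. \tag{2.9} \]

Eliminating $\dot{\mathbf{y}}$ between Equations (2.6) and (2.9), one obtains

\[ \dot{\mathbf{z}} = -(\ddot{\mathbf{Q}} + \mathbf{q})\mathbf{y} - \epsilon(\mathbf{B} + \mathbf{q}\mathbf{Q} + \mathbf{D}\dot{\mathbf{Q}} + \epsilon\mathbf{B}\mathbf{Q})\mathbf{y} - \epsilon(\mathbf{D} + \dot{\mathbf{Q}}\mathbf{R})\mathbf{z}. \tag{2.10} \]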


To this point, the matrix $\mathbf{Q}(t)$ has been assumed to have zero mean
value but has otherwise been left completely arbitrary. At this
point, it is now required that

\[ \ddot{\mathbf{Q}}(t) + \mathbf{q}(t) = \mathbf{0}, \tag{2.11} \]

so that Equation (2.10) assumes the form

\[ \dot{\mathbf{z}} = -\epsilon\mathbf{U}\mathbf{y} - \epsilon\mathbf{V}\mathbf{z}, \tag{2.12} \]

where

\[ \mathbf{U} = \mathbf{B} + \mathbf{q}\mathbf{Q} + \mathbf{D}\dot{\mathbf{Q}} + \epsilon\mathbf{B}\mathbf{Q}, \tag{2.13} \]

\[ \mathbf{V} = \mathbf{D} + \dot{\mathbf{Q}}\mathbf{R}. \tag{2.14} \]

Equations (2.6) and (2.12) are in the general form to which
the method of averaging is applicable, namely,

\[ \dot{\mathbf{y}} = \epsilon\mathbf{R}\mathbf{z}, \qquad \dot{\mathbf{z}} = -\epsilon\mathbf{U}\mathbf{y} - \epsilon\mathbf{V}\mathbf{z}. \tag{2.15} \]


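The theoretical foundations of this technique are discussed in
References [16] to [21] and [52]. The average values of the
matrices $\mathbf{R}$, $\mathbf{U}$, and $\mathbf{V}$ are defined to be

\[ \mathbf{R}^* = M\{\mathbf{R}\}, \qquad \mathbf{U}^* = M\{\mathbf{U}\}, \qquad \mathbf{V}^* = M\{\mathbf{V}\}, \tag{2.16} \]

where

\[ M\{\mathbf{R}\} = \lim_{T\to\infty} (1/T)\int_0^T \mathbf{R}\,dt, \tag{2.17} \]

etc. Substitution of Equations (2.7), (2.13), and (2.14) into
Equation (2.16) yields

\[ \mathbf{R}^* = M\{(\mathbf{I} + \epsilon\mathbf{Q})^{-1}\}, \tag{2.18} \]

\[ \mathbf{U}^* = \mathbf{B} + M\{\mathbf{q}\mathbf{Q}\} + M\{\mathbf{D}\dot{\mathbf{Q}}\} + \epsilon\mathbf{B}M\{\mathbf{Q}\}
 = \mathbf{B} + M\{\mathbf{q}\mathbf{Q}\} + M\{\mathbf{D}_1\dot{\mathbf{Q}}\}, \tag{2.19} \]

\[ \mathbf{V}^* = M\{\mathbf{D}\} + M\{\dot{\mathbf{Q}}\mathbf{R}\}
 = \mathbf{D}_0 + M\{\dot{\mathbf{Q}}\mathbf{R}\}, \tag{2.20} \]

where $\mathbf{D} = \mathbf{D}_0 + \mathbf{D}_1(t)$, Equation (2.2), and $M\{\mathbf{Q}(t)\} = M\{\dot{\mathbf{Q}}(t)\} = \mathbf{0}$
have been used. Now the averaged forms of the differential
equations in Equation (2.15) are

\[ \dot{\mathbf{y}} = \epsilon\mathbf{R}^*\mathbf{z}, \qquad \dot{\mathbf{z}} = -\epsilon\mathbf{U}^*\mathbf{y} - \epsilon\mathbf{V}^*\mathbf{z}, \]

whence

\[ \ddot{\mathbf{z}} + \epsilon\mathbf{V}^*\dot{\mathbf{z}} + \epsilon^2\mathbf{U}^*\mathbf{R}^*\mathbf{z} = \mathbf{0}. \tag{2.21} \]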

This  is  the  important  system  of  differential  equations  that  the 
method  of  averaging  generates.  It  is  related  to  the  original 
system  in  Equation  (2.1)  that  possesses  time  dependent  (indeed 
periodic)  coefficients.  Even  though  this  new  system  in  Equation 
(2.21)  has  constant  coefficients,  it  still  serves  to  furnish 
important  information  regarding  the  state  of  stability  of  the 
original  physical  system  under  consideration.  In  the  sections 
that  follow,  the  analysis  described  will  be  based  upon  the 
consequences  of  this  simplified  system  of  ordinary  differential 
equations.
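The way the constant coefficient system reflects the stability of the periodic one can be illustrated on a scalar analogue of Equation (2.1) without damping, v'' + (eps^2 b + eps s sin t) v = 0. For q(t) = s sin t, the zero-mean solution of Equation (2.11) is Q(t) = s sin t, so M{qQ} = s^2/2 and the averaged stiffness is b + s^2/2 to first order in eps. The following Python sketch is only an illustration under stated assumptions: the values of eps, b, and s, the initial conditions, and the fixed-step Runge-Kutta integrator are choices made here, not taken from the paper.

```python
import math

def peak_response(b, s, eps, t_end, dt=0.01):
    """Integrate v'' + (eps**2*b + eps*s*sin t) v = 0 with classical RK4,
    starting from v(0) = 1, v'(0) = 0, and return max |v| on [0, t_end]."""
    def accel(t, v):
        return -(eps**2 * b + eps * s * math.sin(t)) * v

    t, v, w = 0.0, 1.0, 0.0            # w is the velocity v'
    peak = abs(v)
    for _ in range(int(round(t_end / dt))):
        k1v, k1w = w, accel(t, v)
        k2v, k2w = w + 0.5*dt*k1w, accel(t + 0.5*dt, v + 0.5*dt*k1v)
        k3v, k3w = w + 0.5*dt*k2w, accel(t + 0.5*dt, v + 0.5*dt*k2v)
        k4v, k4w = w + dt*k3w, accel(t + dt, v + dt*k3v)
        v += dt * (k1v + 2*k2v + 2*k3v + k4v) / 6
        w += dt * (k1w + 2*k2w + 2*k3w + k4w) / 6
        t += dt
        peak = max(peak, abs(v))
    return peak

eps, b, s = 0.1, -1.0, 2.0             # illustrative values (assumed)
averaged_stiffness = b + s**2 / 2      # scalar U* R* to first order in eps
peak_forced = peak_response(b, s, eps, 200.0)
peak_unforced = peak_response(b, 0.0, eps, 200.0)
print(averaged_stiffness, peak_forced, peak_unforced)
```

With b < 0 the unforced system (s = 0) grows exponentially, whereas the forced system remains bounded over the integration window, in agreement with the positive averaged stiffness.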

It is possible to integrate Equation (2.11) twice. Thus,
a first integration leads to

\[ \dot{\mathbf{Q}}(t) = \dot{\mathbf{Q}}(0) - \int_0^t \mathbf{q}(x)\,dx. \tag{2.22} \]

A second integration gives

\[ \mathbf{Q}(t) = \mathbf{Q}(0) + \dot{\mathbf{Q}}(0)\,t - \int_0^t\!\!\int_0^y \mathbf{q}(x)\,dx\,dy
 = \mathbf{Q}(0) + \dot{\mathbf{Q}}(0)\,t - \int_0^t (t - x)\,\mathbf{q}(x)\,dx. \tag{2.23} \]

Taking the averages of Equations (2.22) and (2.23), one finds
that

\[ \dot{\mathbf{Q}}(0) = M\Big\{\int_0^t \mathbf{q}(x)\,dx\Big\} = M\{(T - t)\,\mathbf{q}(t)\}
 = \lim_{T\to\infty} (1/T)\int_0^T (T - t)\,\mathbf{q}(t)\,dt \tag{2.24} \]

and

\[ \mathbf{Q}(0) = -M\Big\{\dot{\mathbf{Q}}(0)\,t - \int_0^t (t - x)\,\mathbf{q}(x)\,dx\Big\}
 = -(1/2)\lim_{T\to\infty} (1/T)\Big\{\dot{\mathbf{Q}}(0)\,T^2 - \int_0^T (T - t)^2\,\mathbf{q}(t)\,dt\Big\}. \tag{2.25} \]

Once the constant matrices $\dot{\mathbf{Q}}(0)$ and $\mathbf{Q}(0)$ have been evaluated from
Equations (2.24) and (2.25), respectively, the expression for
$\mathbf{Q}(t)$ becomes completely known from Equation (2.23).

Example. Let $\mathbf{Q}(t) = [Q_{ij}(t)]$ and $\mathbf{q}(t) = [q_{ij}(t)]$, where,
for i, j = 1(1)n,

\[ q_{ij}(t) = s_{ij}\sin t + r_{ij}\sin(\nu_{ij}t), \tag{2.26} \]

the quantities $s_{ij}$, $r_{ij}$, and $\nu_{ij}$ being known constants. The goal
is to determine $\mathbf{Q}(t)$ from Equation (2.23).

According to Equations (2.23) and (2.26), it follows that

\[ \int_0^t (t - x)\sin(\nu_{ij}x)\,dx = (1/\nu_{ij}^2)(\nu_{ij}t - \sin\nu_{ij}t), \tag{2.27} \]

so that Equation (2.24) leads to

\[ \dot{Q}_{ij}(0) = \lim_{T\to\infty} (1/T)\int_0^T (T - t)[s_{ij}\sin t + r_{ij}\sin(\nu_{ij}t)]\,dt
 = \lim_{T\to\infty} (1/T)[s_{ij}(T - \sin T) + (r_{ij}/\nu_{ij}^2)(\nu_{ij}T - \sin(\nu_{ij}T))]
 = s_{ij} + r_{ij}/\nu_{ij}. \tag{2.28} \]

To evaluate $Q_{ij}(0)$ in Equation (2.25), the following integral is
required:

\[ \int_0^T (T - t)^2 q_{ij}(t)\,dt = s_{ij}\int_0^T (T - t)^2\sin t\,dt + r_{ij}\int_0^T (T - t)^2\sin(\nu_{ij}t)\,dt
 = s_{ij}(T^2 + 2\cos T - 2) + (r_{ij}/\nu_{ij}^3)[\nu_{ij}^2T^2 + 2\cos(\nu_{ij}T) - 2]. \tag{2.29} \]

Hence, from Equations (2.25) and (2.29), it is found that

\[ Q_{ij}(0) = -\lim_{T\to\infty} (1/T)[s_{ij}(1 - \cos T) + (r_{ij}/\nu_{ij}^3)(1 - \cos(\nu_{ij}T))] = 0. \tag{2.30} \]

Therefore, inserting Equations (2.28) and (2.30) into Equation
(2.23), one has

\[ Q_{ij}(t) = (s_{ij} + r_{ij}/\nu_{ij})\,t - \int_0^t (t - x)[s_{ij}\sin x + r_{ij}\sin(\nu_{ij}x)]\,dx
 = s_{ij}\sin t + (r_{ij}/\nu_{ij}^2)\sin(\nu_{ij}t). \tag{2.31} \]

Consequently, given the elements $q_{ij}(t)$ in Equation (2.26), the
corresponding elements $Q_{ij}(t)$ are determined to be those stated
in Equation (2.31).
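The result of the Example can be checked numerically for a single matrix element. The Python sketch below is an illustration only: the constants s, r, and nu are arbitrary values chosen here, the second derivative is approximated by a central difference, and the infinite-time mean of Equation (2.5) is replaced by an average over a long but finite interval.

```python
import math

def q(t, s, r, nu):
    """Element q_ij(t) of Equation (2.26)."""
    return s * math.sin(t) + r * math.sin(nu * t)

def Q(t, s, r, nu):
    """Element Q_ij(t) of Equation (2.31)."""
    return s * math.sin(t) + (r / nu**2) * math.sin(nu * t)

def time_mean(f, T, n=200_000):
    """Trapezoidal estimate of (1/T) * integral of f over [0, T]."""
    h = T / n
    total = 0.5 * (f(0.0) + f(T)) + sum(f(k * h) for k in range(1, n))
    return total * h / T

s, r, nu = 1.5, 0.7, 3.0               # illustrative constants (assumed)
h = 1e-3
Qf = lambda t: Q(t, s, r, nu)

# Residual of Equation (2.11), using a central second difference for Q''.
residual = max(
    abs((Qf(t + h) - 2 * Qf(t) + Qf(t - h)) / h**2 + q(t, s, r, nu))
    for t in (0.3 * k for k in range(50))
)
# Initial slope, to be compared with s + r/nu from Equation (2.28).
slope0 = (Qf(h) - Qf(-h)) / (2 * h)
# Finite-horizon estimate of the mean value required by Equation (2.5).
mean_Q = time_mean(Qf, 2000.0)
print(residual, slope0, mean_Q)
```

The residual and the finite-horizon mean are both small, and the initial slope agrees with Equation (2.28) to within the finite-difference error.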
3. THE EQUATIONS OF MOTION

Consider now Ziegler's double pendulum mounted on a movable
base of mass $m_1$. The base moves as a rigid body in the x- and y-
directions as shown in Figure 1. The xy-axes remain stationary,
whereas the x'y'-axes translate with the movable base. The
coordinates of the support point are $(x_o(t), y_o(t))$ relative to
the stationary origin O. The mathematical model to be used here
parallels that described by Herrmann [53]. The double pendulum
consists of massless rods that carry masses $m_2$ and $m_3$ that are
concentrated at their extremities. The rods are of length $\ell$.
Linear elastic hinges are present in the joints of the system.
The associated rotational spring constants are designated $c_1$ and $c_2$.
The force of gravity acts in the direction of the negative y-
axis. A force P of the follower type is applied at the free
extremity of the double pendulum. The parameter $\alpha$, called the
tangency coefficient, measures the degree of deviation of P from
the vertical direction. A force $\mathbf{F}$ defined by

\[ \mathbf{F} = F_x\,\mathbf{i} + F_y\,\mathbf{j} \tag{3.1} \]

is applied at O' to cause the base of the system to translate in
the xy-plane. In Figure 1, g denotes the acceleration of
gravity, whereas $\varphi_1$ and $\varphi_2$ represent the angular displacements of
the two rods in the pendulum relative to the vertical axis.

For the purpose of deriving the equations of motion, Kane's
method has been used. It consists essentially of employing the
expression

\[ K_i^{*} + K_i = 0, \qquad i = 1, 2, 3, \tag{3.2} \]

where $K_i^{*}$ and $K_i$ are the generalized inertia forces and
generalized active forces, respectively. Since Kane's method is
described in considerable detail in References [54] to [56], the
various steps in the derivation process will not be reported
here.

The equations of motion for the physical system depicted in
Figure 1 can be shown to be


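\[ (m_1 + m_2 + m_3)\ddot{x} - (m_2 + m_3)\ell\dot\varphi_1^2\sin\varphi_1 + (m_2 + m_3)\ell\ddot\varphi_1\cos\varphi_1
 - m_3\ell\dot\varphi_2^2\sin\varphi_2 + m_3\ell\ddot\varphi_2\cos\varphi_2 + P\sin(\alpha\varphi_2) = F_x, \tag{3.3} \]

\[ (m_1 + m_2 + m_3)\ddot{y} - (m_2 + m_3)\ell\dot\varphi_1^2\cos\varphi_1 - (m_2 + m_3)\ell\ddot\varphi_1\sin\varphi_1
 - m_3\ell\dot\varphi_2^2\cos\varphi_2 - m_3\ell\ddot\varphi_2\sin\varphi_2 + (m_1 + m_2 + m_3)g
 + P\cos(\alpha\varphi_2) = F_y, \tag{3.4} \]

\[ (m_2 + m_3)\ell\ddot{x}\cos\varphi_1 - (m_2 + m_3)\ell\ddot{y}\sin\varphi_1 + (m_2 + m_3)\ell^2\ddot\varphi_1
 + m_3\ell^2\dot\varphi_2^2\sin(\varphi_1 - \varphi_2) + m_3\ell^2\ddot\varphi_2\cos(\varphi_1 - \varphi_2)
 - (m_2 + m_3)\ell g\sin\varphi_1 + (c_1 + c_2)\varphi_1 - c_2\varphi_2
 - P\ell\sin(\varphi_1 - \alpha\varphi_2) = 0, \tag{3.5} \]

\[ m_3\ell\ddot{x}\cos\varphi_2 - m_3\ell\ddot{y}\sin\varphi_2 - m_3\ell^2\dot\varphi_1^2\sin(\varphi_1 - \varphi_2)
 + m_3\ell^2\ddot\varphi_1\cos(\varphi_1 - \varphi_2) + m_3\ell^2\ddot\varphi_2 - m_3\ell g\sin\varphi_2
 + c_2(\varphi_2 - \varphi_1) - P\ell\sin[(1 - \alpha)\varphi_2] = 0. \tag{3.6} \]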

Suppose that the motion of the double pendulum is such that
$|\varphi_j| \ll 1$ for j = 1, 2, i.e., it undergoes small oscillations
about the translating vertical axis y'. In this situation the
non-linear differential equations in Equations (3.3) to (3.6) can
be linearized relative to $\varphi_1 = \varphi_2 = 0$. The results of the
linearization are the following:

\[ (m_1 + m_2 + m_3)\ddot{x} + (m_2 + m_3)\ell\ddot\varphi_1 + m_3\ell\ddot\varphi_2 + P\alpha\varphi_2 = F_x, \tag{3.7} \]

\[ (m_1 + m_2 + m_3)\ddot{y} + (m_1 + m_2 + m_3)g + P = F_y, \tag{3.8} \]

\[ (m_2 + m_3)\ell\ddot{x} + (m_2 + m_3)\ell^2\ddot\varphi_1 + m_3\ell^2\ddot\varphi_2 + [c_1 + c_2 - P\ell
 - (m_2 + m_3)\ell(g + \ddot{y})]\varphi_1 + (P\ell\alpha - c_2)\varphi_2 = 0, \tag{3.9} \]

\[ m_3\ell\ddot{x} + m_3\ell^2\ddot\varphi_1 + m_3\ell^2\ddot\varphi_2 - c_2\varphi_1 + [c_2 - P\ell(1 - \alpha)
 - m_3\ell(g + \ddot{y})]\varphi_2 = 0. \tag{3.10} \]

Next let

\[ m_1 = \sigma m, \tag{3.11} \]

where $\sigma$ is a parameter and, as was done in Reference [53],

\[ m_2 = 2m, \qquad m_3 = m, \qquad \text{and} \qquad c_1 = c_2 = c. \tag{3.12} \]

Moreover, if $F_x$ and $F_y$ are given as explicit functions of time t,
then Equations (3.7) to (3.10) represent four differential
equations in the four unknowns x, y, $\varphi_1$, and $\varphi_2$. Suppose further
that the translational displacements of the point of support x(t)
and y(t) are known explicitly, namely,

\[ x(t) = 0, \qquad y(t) = y_0\sin\Omega_1 t. \tag{3.13} \]

Under these assumptions, Equations (3.7) to (3.10) assume the
forms

\[ F_x = 3m\ell\ddot\varphi_1 + m\ell\ddot\varphi_2 + P\alpha\varphi_2, \tag{3.14} \]

\[ F_y = (3 + \sigma)mg + P - (3 + \sigma)m y_0\Omega_1^2\sin\Omega_1 t, \tag{3.15} \]

and

\[ 3m\ell^2\ddot\varphi_1 + m\ell^2\ddot\varphi_2 + [2c - P\ell - 3m\ell(g - y_0\Omega_1^2\sin\Omega_1 t)]\varphi_1
 + (P\ell\alpha - c)\varphi_2 = 0, \tag{3.17} \]

\[ m\ell^2\ddot\varphi_1 + m\ell^2\ddot\varphi_2 - c\varphi_1 + [c - P\ell(1 - \alpha)
 - m\ell(g - y_0\Omega_1^2\sin\Omega_1 t)]\varphi_2 = 0. \tag{3.18} \]

The  differential  equations  in  Equations  (3.17)  and  (3.18)  are 
linear  and  homogeneous.  It  is  important  to  note  that  each  of 
these  equations  contains  a  term  in  which  a  periodic  coefficient 
appears.  For  the  determination  of  the  state  of  stability  of  the 
vibrationally controlled system, it suffices to consider only
Equations (3.17) and (3.18). These equations are particular
cases of the more general system of equations stated in Equation
(2.1). However, if a small parameter can be introduced into
the equations stated above, then the method of averaging can be
applied, and the averaged system of equations in Equation (2.21)
can be employed for the determination of the conditions for
stability of the non-conservatively loaded double pendulum shown
in Figure 1. It is much more convenient to deal with the
averaged equations than with those stated in Equations (3.17) and
(3.18).

In the event that $g = y_0 = 0$, Equations (3.17) and (3.18)
become identical to those considered by Herrmann and Bungay [49].
It may be observed that the dimensionless parameter $\sigma$ associated
with the mass of the movable base does not appear in the central
differential equations in Equations (3.17) and (3.18).

It is convenient to put Equations (3.17) and (3.18) into a
dimensionless form. For this purpose, the following quantities
are introduced:

\[ \tau = \Omega_1 t, \qquad \epsilon = y_0/\ell, \qquad \omega_0^2 = g/\ell, \]
\[ \omega^2 = \omega_0^2\ell^2/(\Omega_1^2y_0^2), \qquad r = c/(m\Omega_1^2y_0^2), \qquad Q_0 = P\ell/(m\Omega_1^2y_0^2). \tag{3.19} \]

Using Equation (3.19) and combining Equations (3.17) and (3.18),
one can easily express the dimensionless forms of the system of
differential equations under consideration in the form required
in Equation (2.1), namely,

\[ \ddot{\mathbf{v}}(\tau) + [\epsilon^2\mathbf{B} + \epsilon\,\mathbf{q}(\tau)]\,\mathbf{v}(\tau) = \mathbf{0}, \qquad \dot{\mathbf{v}} = d\mathbf{v}/d\tau, \tag{3.20} \]

since $\mathbf{D}(\tau) = \mathbf{0}$, i.e., no damping terms have been included in the
present analysis, where the elements of the $\mathbf{B}$ and $\mathbf{q}(\tau)$ matrices
are

\[ B_{11} = (3r - Q_0 - 3\omega^2)/2, \qquad B_{12} = (-2r + Q_0 + \omega^2)/2, \]
\[ B_{21} = (-5r + Q_0 + 3\omega^2)/2, \qquad B_{22} = [4r + (2\alpha - 3)Q_0 - 3\omega^2]/2, \tag{3.21} \]

\[ s_{11} = 3/2, \qquad s_{12} = -1/2, \qquad s_{21} = -3/2, \qquad s_{22} = 3/2, \tag{3.22} \]

with

\[ \mathbf{q}(\tau) = \mathbf{s}\sin\tau. \tag{3.23} \]

Equation (3.20) is in the canonical form, so that the theoretical
results developed in Section 2 can now be applied for the
specific forms of the quantities stated in Equations (3.21) to
(3.23).


The system of differential equations in Equation (3.20)
describes the small motions of a non-conservatively loaded double
pendulum whose point of support is driven in a sinusoidal manner
along a vertical axis. Restoring moments are exerted at the
hinges located at the point of support and at the joint of the
double pendulum. The intention here is to show that, when the
point of support is driven at small amplitude and high frequency,
the critical value of the applied force can be raised, i.e., the
system can be stabilized.
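The elements listed in Equations (3.21) and (3.22) can be reproduced by a short exact computation. The Python sketch below assumes an intermediate step that is not written out in the text: Equations (3.17) and (3.18) are divided by m l^2 Omega_1^2 and rewritten with the quantities of Equation (3.19), which gives a mass matrix, a stiffness matrix multiplying eps^2, and a forcing matrix multiplying eps sin(tau). The names MASS, FORCING, stiffness, and B_matrix are introduced here for illustration.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Coefficient matrices read off from Equations (3.17) and (3.18) after
# division by m * l**2 * Omega_1**2 (this intermediate step is assumed).
MASS = [[F(3), F(1)], [F(1), F(1)]]      # multiplies the second derivatives
FORCING = [[F(3), F(0)], [F(0), F(1)]]   # multiplies eps * sin(tau)

def stiffness(r, Q0, w2, alpha):
    """Matrix multiplying eps**2; w2 stands for omega**2."""
    return [[2*r - Q0 - 3*w2, alpha*Q0 - r],
            [-r, r - (1 - alpha)*Q0 - w2]]

MASS_INV = inverse(MASS)
s = matmul(MASS_INV, FORCING)            # compare with Equation (3.22)

def B_matrix(r, Q0, w2, alpha):
    """Compare with Equation (3.21)."""
    return matmul(MASS_INV, stiffness(r, Q0, w2, alpha))

print(s)
print(B_matrix(F(2), F(3), F(5), F(1, 2)))
```

Exact rational arithmetic is used so that the comparison with the closed-form elements is free of rounding error.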




4.  THE  AVERAGED  EQUATIONS  OF  MOTION 


In lieu of working with Equation (3.20), which has periodic
coefficients, it is desirable to determine the explicit form of
the differential equation in Equation (2.21) that has constant
coefficients. To determine the quantities $\mathbf{V}^*$ and $\mathbf{U}^*\mathbf{R}^*$, it is
first necessary to evaluate $\mathbf{Q}(\tau)$. This has, in fact, already
been accomplished in the Example in Section 2. Equations (3.23)
and (2.26) are equivalent if $r_{ij} = 0$ for all i and j considered.
It follows from Equation (2.31) that

\[ \mathbf{Q}(\tau) = \mathbf{s}\,q(\tau), \tag{4.1} \]

where

\[ q(\tau) = \sin\tau. \tag{4.2} \]

It is easily shown from Equations (2.18) and (4.1) that

\[ \mathbf{R}^* = M\{(\mathbf{I} + \epsilon\mathbf{s}\,q(\tau))^{-1}\} = M\{(\mathbf{I} + \epsilon\mathbf{s}^{-1}q(\tau))/G(\tau)\}, \tag{4.3} \]

where

\[ G(\tau) = 1 + 3\epsilon q(\tau) + 3\epsilon^2q^2(\tau)/2. \tag{4.4} \]

Then from Equations (2.20) and (4.1) to (4.4), one finds that

\[ \mathbf{V}^* = \mathbf{s}M\{\dot{q}(\tau)\mathbf{R}\} = \mathbf{s}M\{\dot{q}(\tau)[\mathbf{I} + \epsilon\mathbf{s}^{-1}q(\tau)]/G(\tau)\}
 = \mathbf{s}M\{\dot{q}(\tau)/G(\tau)\} + \epsilon\mathbf{I}M\{\dot{q}(\tau)q(\tau)/G(\tau)\} = \mathbf{0}, \tag{4.5} \]


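since

\[ M\{\dot{q}(\tau)/G(\tau)\} = \lim_{T\to\infty}\frac{1}{T}\int_0^T \frac{\dot{q}\,d\tau}{1 + 3\epsilon q + 3\epsilon^2q^2/2}
 = \frac{1}{\sqrt{3}\,\epsilon}\lim_{T\to\infty}\frac{1}{T}\ln\left[\frac{2 + \epsilon(3 + \sqrt{3})q(T)}{2 + \epsilon(3 - \sqrt{3})q(T)}\right] = 0, \]

\[ M\{\dot{q}(\tau)q(\tau)/G(\tau)\} = \lim_{T\to\infty}\frac{1}{T}\int_0^T \frac{q\dot{q}\,d\tau}{1 + 3\epsilon q + 3\epsilon^2q^2/2}
 = \frac{1}{3\epsilon^2}\lim_{T\to\infty}\frac{1}{T}\ln[1 + 3\epsilon q(T) + 3\epsilon^2q^2(T)/2]
 - \frac{1}{\epsilon}M\{\dot{q}(\tau)/G(\tau)\} = 0. \]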


Now, Equation (4.3) yields

\[ \mathbf{R}^* = \mathbf{I}M\{1/G(\tau)\} + \epsilon\mathbf{s}^{-1}M\{q(\tau)/G(\tau)\}
 = \mathbf{I} - \epsilon(3\mathbf{I} - \mathbf{s}^{-1})M\{q(\tau)\} + \epsilon^2[(15/2)\mathbf{I} - 3\mathbf{s}^{-1}]M\{q^2(\tau)\}
 - \epsilon^3[18\mathbf{I} - (15/2)\mathbf{s}^{-1}]M\{q^3(\tau)\} + \epsilon^4[(279/4)\mathbf{I} - 18\mathbf{s}^{-1}]M\{q^4(\tau)\} + O(\epsilon^5) \tag{4.6} \]

in view of Equations (4.2) and (4.3), where the right side is the
result of expanding the quantity $1/G(\tau)$ in terms of the small
parameter $\epsilon$. But

\[ q(\tau) = \sin\tau, \qquad q^2(\tau) = (1 - \cos 2\tau)/2, \]
\[ q^3(\tau) = (3\sin\tau - \sin 3\tau)/4, \qquad q^4(\tau) = (3 - 4\cos 2\tau + \cos 4\tau)/8, \]

so that

\[ M\{q(\tau)\} = M\{q^3(\tau)\} = 0, \qquad M\{q^2(\tau)\} = 1/2, \qquad M\{q^4(\tau)\} = 3/8. \]
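The four mean values just quoted are easy to confirm numerically. The Python sketch below averages powers of sin(tau) over a single period with the midpoint rule; for a periodic integrand the average over one period equals the infinite-time mean M{.}. The function name mean_power and the sample count are choices made here for illustration.

```python
import math

def mean_power(n, samples=100_000):
    """Mean of sin(tau)**n over one period, by the midpoint rule."""
    h = 2 * math.pi / samples
    return sum(math.sin((k + 0.5) * h) ** n for k in range(samples)) / samples

means = {n: mean_power(n) for n in (1, 2, 3, 4)}
print(means)
```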


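Therefore, Equation (4.6) is reduced to

\[ \mathbf{R}^* = \mathbf{I} + \epsilon^2[(15/2)\mathbf{I} - 3\mathbf{s}^{-1}]/2 + 3\epsilon^4[(279/4)\mathbf{I} - 18\mathbf{s}^{-1}]/8 + O(\epsilon^5). \tag{4.7} \]

Because $\mathbf{D}_1(\tau) = \mathbf{0}$ and by virtue of Equation (4.1), Equation
(2.19) becomes

\[ \mathbf{U}^* = \mathbf{B} + \mathbf{s}^2M\{q^2(\tau)\} = \mathbf{B} + \mathbf{s}^2/2. \tag{4.8} \]

Therefore, the product $\mathbf{U}^*\mathbf{R}^*$ to a first approximation is, in
view of Equations (4.7) and (4.8),

\[ \mathbf{U}^*\mathbf{R}^* = \mathbf{B} + \mathbf{s}^2/2 + O(\epsilon^2). \tag{4.9} \]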


Neglecting the $O(\epsilon^2)$ term in Equation (4.9), one has as the specific
form of Equation (2.21) for the problem under consideration

\[ \ddot{\mathbf{z}} + \epsilon^2\mathbf{A}\mathbf{z} = \mathbf{0}, \tag{4.10} \]

where

\[ \mathbf{A} = \mathbf{B} + \mathbf{s}^2/2, \tag{4.11} \]

with the matrices $\mathbf{B}$ and $\mathbf{s}$ being defined in Equations (3.21) and
(3.22). The elements of the matrix $\mathbf{A}$ are obviously constants;
specifically, the $A_{ij}$ are

\[ A_{11} = (3 + 3r - Q_0 - 3\omega^2)/2, \qquad A_{12} = -(3/2 + 2r - Q_0 - \omega^2)/2, \]
\[ A_{21} = -(9/2 + 5r - Q_0 - 3\omega^2)/2, \qquad A_{22} = [3 + 4r + (2\alpha - 3)Q_0 - 3\omega^2]/2, \tag{4.12} \]

by virtue of Equations (3.21) and (3.22).


5. STABILITY ANALYSIS


The determination of the state of stability of the
sinusoidally driven Ziegler's pendulum is based upon a careful
examination of the eigenvalues associated with the system of
differential equations in Equation (4.10).

A solution of Equation (4.10) is sought in the following
form:

\[ \mathbf{z} = \mathbf{a}\,e^{i\lambda\tau}, \tag{5.1} \]

where $\mathbf{a}$ is a constant column vector, $i = (-1)^{1/2}$, and $\lambda$ is the
eigenvalue (i.e., the dimensionless natural frequency of the
system) to be determined. Substitution of Equation (5.1) into
Equation (4.10) yields the system of homogeneous algebraic
equations

\[ (\lambda^2\mathbf{I} - \epsilon^2\mathbf{A})\,\mathbf{a} = \mathbf{0}, \tag{5.2} \]

which has a non-trivial solution if and only if

\[ \mathrm{Det}(\lambda^2\mathbf{I} - \epsilon^2\mathbf{A}) = 0, \]

whence, upon expansion,

\[ \lambda^4 - \epsilon^2(A_{11} + A_{22})\lambda^2 + \epsilon^4(A_{11}A_{22} - A_{12}A_{21}) = 0, \tag{5.3} \]

where the $A_{ij}$'s have been given in Equation (4.12). In
particular, Equation (5.3) can be expressed as

\[ \lambda^4 - (\epsilon^2/2)[7r + 6(1 - \omega^2) + 2(\alpha - 2)Q_0]\lambda^2
 + (\epsilon^4/4)\{2(1 - \alpha)Q_0^2 + 2[\omega^2 - 3(1 - \alpha)(1 + r - \omega^2)]Q_0
 + 2r^2 - 10r\omega^2 + 6\omega^4 + 9r/2 - 9\omega^2 + 9/4\} = 0. \tag{5.4} \]
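The stability test implied by Equation (5.3) can be sketched in a few lines of Python. With mu = (lambda/eps)^2, Equation (5.3) is the quadratic mu^2 - (A11 + A22) mu + (A11 A22 - A12 A21) = 0: two real non-negative roots give real frequencies, a negative root gives divergence, and a complex pair gives flutter. The sketch assumes the elements of A as reconstructed in Equation (4.12), including the leading minus sign of A12; the function names and the sample parameter values are illustrative and not taken from the paper.

```python
import math

def A_matrix(r, Q0, w2, alpha):
    """Elements of A from Equation (4.12); w2 stands for omega**2."""
    return [[(3 + 3*r - Q0 - 3*w2) / 2, -(1.5 + 2*r - Q0 - w2) / 2],
            [-(4.5 + 5*r - Q0 - 3*w2) / 2,
             (3 + 4*r + (2*alpha - 3)*Q0 - 3*w2) / 2]]

def classify(A):
    """Classify the roots mu = (lambda/eps)**2 of Equation (5.3)."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = trace**2 - 4 * det
    if disc < 0:
        return "flutter"        # complex lambda**2: growing oscillation
    smaller_root = (trace - math.sqrt(disc)) / 2
    if smaller_root < 0:
        return "divergence"     # a negative lambda**2: monotone growth
    return "stable"             # both lambda**2 real and non-negative

# Illustrative cases with r = 4 and omega = 0 (assumed values).
print(classify(A_matrix(4.0, 8.0, 0.0, 1.0)))
print(classify(A_matrix(4.0, 12.0, 0.0, 1.0)))
print(classify(A_matrix(4.0, 4.0, 0.0, 0.0)))
```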


The  forms  of  the  dimensionless  parameters  in  Equation  (3.19)  were 
introduced  for  the  purpose  of  expressing  the  system  of 
differential  equations  in  Equations  (3.17)  and  (3.18)  in  the 
canonical  form  of  Equation  (2.1).  Now  it  is  desirable  to  select 
a  new  set  of  parameters  in  order  to  conform  more  closely  with 
previously  published  investigations  on  non-conservatively  loaded 
systems,  such  as  those  reported  in  References  [48]  to  [51]  and 
[53], for example. Therefore, the following definitions are
made:

\[ Q = P\ell/c, \qquad \gamma = mg\ell/c, \qquad \xi = \epsilon = y_0/\ell, \]
\[ \Omega = \Omega_1\ell(m/c)^{1/2}, \qquad \sigma = \xi\Omega, \qquad \bar\omega = \sigma\lambda. \tag{5.5} \]

The quantity $Q$ denotes the dimensionless load parameter, $\gamma$ the
gravity parameter, and $\bar\omega$ the natural frequency parameter. The
quantity $\xi$ is a measure of the amplitude, and $\Omega$ is the
dimensionless frequency of the vertical sinusoidal motion of the
point of support of the pendulum. The product of $\xi$ and $\Omega$ is
designated as $\sigma$; it is a measure of the motion of the point of
support. It is easily shown that $r = 1/\sigma^2$, $\omega^2 = \gamma/\sigma^2$, and
$Q_0 = Q/\sigma^2$. Using the quantities defined in Equation (5.5), one can
express the frequency equation in Equation (5.4) as

\[ \bar\omega^4 - (\xi^2/2)[7 + 6(\sigma^2 - \gamma) + 2(\alpha - 2)Q]\bar\omega^2
 + (\xi^4/4)\{2(1 - \alpha)Q^2 + 2[\gamma - 3(1 - \alpha)(1 - \gamma + \sigma^2)]Q
 + 2 - 10\gamma + 6\gamma^2 + \sigma^2(9/2 - 9\gamma + 9\sigma^2/4)\} = 0. \tag{5.6} \]


The  value  of  the  dimensionless  critical  divergence  load  Qd 
is  determined  from  the  condition  of  vanishing  frequency  (S  =  0) . 
In  this  case,  Equation  (5.6)  becomes  simply 


8  ( 1 


*)Q*  +  8[y-  3(1  -  60  (1  -  y+or2)]Q,  +  8 
a  <1 


40y  +  24y2  +  or2  (18  -  36r  +  9<f2) 


0  . 


(5.7) 


This  is  a  quadratic  equation  in  ,  so  its  solution  can  be 
determined  by  elementary  methods.  It  is  evident  that  the  value 
of  the  critical  divergence  load  depends  only  upon  the  tangency 
coefficient,  and  the  gravity  and  base  motion  parameters. 

The value of the dimensionless critical flutter load $Q_f$ is
determined from the condition of the coalescence of the two
natural frequencies of vibration of the system, i.e., $\omega_1 = \omega_2$.
This implies that the discriminant of the quadratic equation in
$\omega^2$ in Equation (5.6) must vanish. Expansion of this discriminant
leads to the following equation for $Q_f$:

$$4(2 - 2\alpha + \alpha^2)Q_f^2 + 4(\alpha - 8 + 4\gamma - 6\sigma^2)Q_f + 41 - 44\gamma + 12\gamma^2 + \sigma^2(66 - 36\gamma + 27\sigma^2) = 0. \tag{5.8}$$


Equation (5.8) is a quadratic equation in $Q_f$. Obviously, the
value of $Q_f$ will depend upon the values of the tangency
coefficient $\alpha$, the gravity coefficient $\gamma$, and the support motion
parameter $\sigma$.


If one sets $\gamma = \sigma = 0$ in Equations (5.7) and (5.8), they
become

$$(1 - \alpha)Q_d^2 - 3(1 - \alpha)Q_d + 1 = 0, \tag{5.9}$$

$$4(\alpha^2 - 2\alpha + 2)Q_f^2 + 4(\alpha - 8)Q_f + 41 = 0, \tag{5.10}$$

respectively, which are the expressions equivalent to those
reported by Herrmann and Bungay [49].
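The passage from the frequency equation to Equations (5.7) and (5.8) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it reads Equation (5.6) as a quadratic in $\omega^2$ with any common scale factor divided out, and the function names are hypothetical, not taken from the paper. It confirms at random parameter values that the constant term reproduces Equation (5.7) and that the discriminant reproduces Equation (5.8).

```python
import random


def frequency_quadratic(alpha, gamma, sigma, Q):
    """Return (A, C) such that Eq. (5.6) reads w^4 - (A/2) w^2 + C/4 = 0.

    Assumption of this sketch: any common scale factor multiplying the
    coefficients has been divided out; it cancels in both conditions below.
    """
    A = 7 + 6 * (sigma**2 - gamma) + 2 * (alpha - 2) * Q
    C = (2 * (1 - alpha) * Q**2
         + 2 * (gamma - 3 * (1 - alpha) * (1 - gamma + sigma**2)) * Q
         + 2 - 10 * gamma + 6 * gamma**2
         + sigma**2 * (4.5 - 9 * gamma + 2.25 * sigma**2))
    return A, C


def divergence_lhs(alpha, gamma, sigma, Q):
    """Left-hand side of Eq. (5.7)."""
    return (8 * (1 - alpha) * Q**2
            + 8 * (gamma - 3 * (1 - alpha) * (1 - gamma + sigma**2)) * Q
            + 8 - 40 * gamma + 24 * gamma**2
            + sigma**2 * (18 - 36 * gamma + 9 * sigma**2))


def flutter_lhs(alpha, gamma, sigma, Q):
    """Left-hand side of Eq. (5.8)."""
    return (4 * (2 - 2 * alpha + alpha**2) * Q**2
            + 4 * (alpha - 8 + 4 * gamma - 6 * sigma**2) * Q
            + 41 - 44 * gamma + 12 * gamma**2
            + sigma**2 * (66 - 36 * gamma + 27 * sigma**2))


def max_mismatch(trials=1000, seed=0):
    """Largest deviation between the derived and the stated polynomials."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        alpha, gamma, sigma, Q = (rng.uniform(-2.0, 2.0) for _ in range(4))
        A, C = frequency_quadratic(alpha, gamma, sigma, Q)
        # Vanishing frequency: the constant term must vanish; 4*C is Eq. (5.7).
        worst = max(worst, abs(4 * C - divergence_lhs(alpha, gamma, sigma, Q)))
        # Coalescing frequencies: (A/2)^2 - C = 0; four times this is Eq. (5.8).
        worst = max(worst, abs(A**2 - 4 * C - flutter_lhs(alpha, gamma, sigma, Q)))
    return worst
```

A call such as `max_mismatch()` should return a value at the level of floating-point round-off if the three equations are mutually consistent.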
6. NUMERICAL RESULTS


In this section, the information regarding the state of
stability of the double pendulum contained in Equations (5.7) and
(5.8) will be extracted by analytical procedures, insofar as
possible, and then by numerical computations. The goal is to
plot stability diagrams, i.e., plots of the critical divergence
and flutter loads versus the tangency coefficient $\alpha$.

Equation (5.7) can be expressed as

$$8(1 - \alpha)Q_d^2 - 8(b_1 - \alpha b_2)Q_d + b_3 = 0, \tag{6.1}$$

where

$$b_1 = 3 - 4\gamma + 3\sigma^2, \qquad b_2 = 3(1 - \gamma + \sigma^2), \qquad b_3 = 8(1 - 5\gamma + 3\gamma^2) + 9\sigma^2(2 - 4\gamma + \sigma^2). \tag{6.2}$$

The solutions of Equation (6.1) are easily shown to be

$$Q_{d1} = \frac{8(b_1 - \alpha b_2) - 4\sqrt{2}\,[2(b_1 - \alpha b_2)^2 - (1 - \alpha)b_3]^{1/2}}{16(1 - \alpha)},$$

$$Q_{d2} = \frac{8(b_1 - \alpha b_2) + 4\sqrt{2}\,[2(b_1 - \alpha b_2)^2 - (1 - \alpha)b_3]^{1/2}}{16(1 - \alpha)}. \tag{6.3}$$
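Equations (6.1) to (6.3) translate directly into a few lines of code. The following is a minimal sketch, not the authors' program; the function name `divergence_loads` and the convention of returning `None` when the radicand is negative are assumptions made here for illustration.

```python
import math


def divergence_loads(alpha, gamma=0.0, sigma=0.0):
    """Return (Q_d1, Q_d2) from Eq. (6.3), or None if no real root exists.

    alpha: tangency coefficient, gamma: gravity parameter,
    sigma: support motion parameter. Requires alpha != 1.
    """
    if alpha == 1:
        raise ValueError("Eq. (6.3) is singular for alpha = 1")
    # Coefficients of Eq. (6.2).
    b1 = 3 - 4 * gamma + 3 * sigma**2
    b2 = 3 * (1 - gamma + sigma**2)
    b3 = 8 * (1 - 5 * gamma + 3 * gamma**2) + 9 * sigma**2 * (2 - 4 * gamma + sigma**2)
    radicand = 2 * (b1 - alpha * b2) ** 2 - (1 - alpha) * b3
    if radicand < 0:
        return None  # no divergence boundary at this alpha
    root = 4 * math.sqrt(2) * math.sqrt(radicand)
    denom = 16 * (1 - alpha)
    lead = 8 * (b1 - alpha * b2)
    return (lead - root) / denom, (lead + root) / denom
```

For a conservative load with no gravity and a stationary support, `divergence_loads(0.0)` gives the two roots of $Q_d^2 - 3Q_d + 1 = 0$, in agreement with Equation (5.9).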


Each expression in Equation (6.3) represents a branch of the
stability diagram. As will become evident later, it is useful to
know for which value or values of $\alpha$ these two branches merge
($Q_{d1} = Q_{d2}$). This condition implies that

$$2(b_1 - \alpha b_2)^2 - (1 - \alpha)b_3 = 0,$$

which leads to the following quadratic equation in $\alpha$:

$$2b_2^2\alpha^2 + (b_3 - 4b_1 b_2)\alpha + 2b_1^2 - b_3 = 0. \tag{6.4}$$

The solutions $\alpha_m$ of Equation (6.4) are given by

$$36(1 - \gamma + \sigma^2)^2\,\alpha_m = 4(7 - 11\gamma + 6\gamma^2) + 3\sigma^2(18 - 16\gamma + 9\sigma^2) \pm [8(1 - 5\gamma + 3\gamma^2) + 9\sigma^2(2 - 4\gamma + \sigma^2)]^{1/2}\,[8(1 - 2\gamma) + 3\sigma^2(6 - 4\gamma + 3\sigma^2)]^{1/2}. \tag{6.5}$$


Thus, the values of $\alpha_m$ depend upon $\gamma$ and $\sigma$ in a rather
complicated manner. In the event that gravity is absent ($\gamma = 0$),
Equation (6.5) leads to

$$\alpha_{m1} = \frac{10 + 18\sigma^2 + 9\sigma^4}{18(1 + \sigma^2)^2}, \qquad \alpha_{m2} = 1. \tag{6.6}$$

When $\sigma = 0$ (i.e., there is no vertical motion of the support
point) it follows from Equation (6.6) that $\alpha_{m1} = 5/9$ and $\alpha_{m2} = 1$,
which are the values reported in Reference [49]. In the other
extreme, as the value of $\sigma$ tends to infinity, Equation (6.6)
yields $\alpha_{m1} = 1/2$ and $\alpha_{m2} = 1$.
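The limiting values quoted above can be reproduced from Equation (6.5). The sketch below is illustrative only; `merge_alphas` is a hypothetical name, and the two square roots of Equation (6.5) are combined into a single root of the product, which is an assumption that is harmless here because the $\pm$ sign covers both orderings.

```python
import math


def merge_alphas(gamma=0.0, sigma=0.0):
    """Return (alpha_m1, alpha_m2) from Eq. (6.5), or None if they are complex.

    Assumes 1 - gamma + sigma**2 != 0 so that the division is defined.
    """
    s2 = sigma**2
    lead = 4 * (7 - 11 * gamma + 6 * gamma**2) + 3 * s2 * (18 - 16 * gamma + 9 * s2)
    f1 = 8 * (1 - 5 * gamma + 3 * gamma**2) + 9 * s2 * (2 - 4 * gamma + s2)
    f2 = 8 * (1 - 2 * gamma) + 3 * s2 * (6 - 4 * gamma + 3 * s2)
    if f1 * f2 < 0:
        return None  # the branches of Eq. (6.3) do not merge for real alpha
    root = math.sqrt(f1 * f2)
    denom = 36 * (1 - gamma + s2) ** 2
    return (lead - root) / denom, (lead + root) / denom
```

With the default arguments the function returns 5/9 and 1; for a large value of `sigma` the first value approaches 1/2, as stated in the text.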

In analogy with Equation (6.1), it is convenient to express
Equation (5.8) in a more compact form:

$$4a_0 Q_f^2 + 4(\alpha - b_0)Q_f + c_0 = 0, \tag{6.7}$$

where

$$a_0 = 2 - 2\alpha + \alpha^2, \qquad b_0 = 8 - 4\gamma + 6\sigma^2, \qquad c_0 = 41 - 44\gamma + 12\gamma^2 + \sigma^2(66 - 36\gamma + 27\sigma^2). \tag{6.8}$$


The solutions of Equation (6.7) are

$$Q_f = \frac{b_0 - \alpha \pm (\alpha^2 - 2b_0\alpha + b_0^2 - a_0 c_0)^{1/2}}{2a_0}, \tag{6.9}$$

which provides two branches of the stability diagram in the $\alpha Q$-plane.
These branches can exist only if the radicand in Equation
(6.9) is positive. Limiting values of the tangency coefficient
can be found when $Q_{f1} = Q_{f2}$, which implies that the radicand
vanishes. This condition means that

$$(c_0 - 1)\alpha^2 + 2(b_0 - c_0)\alpha + 2c_0 - b_0^2 = 0 \tag{6.10}$$


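must hold. Its solutions are

$$\alpha_f = \frac{33 - 40\gamma + 12\gamma^2 + 3\sigma^2(20 - 12\gamma + 9\sigma^2) \pm [41 - 44\gamma + 12\gamma^2 + \sigma^2(66 - 36\gamma + 27\sigma^2)]^{1/2}\,[9 - 12\gamma + 4\gamma^2 + 3\sigma^2(6 - 4\gamma + 3\sigma^2)]^{1/2}}{40 - 44\gamma + 12\gamma^2 + \sigma^2(66 - 36\gamma + 27\sigma^2)}. \tag{6.11}$$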

If gravity is absent from the physical system under
consideration, then $\gamma = 0$ and Equation (6.11) becomes

$$\alpha_f = \frac{3(1 + \sigma^2)\,[11 + 9\sigma^2 \pm (41 + 66\sigma^2 + 27\sigma^4)^{1/2}]}{(4 + 3\sigma^2)(10 + 9\sigma^2)}. \tag{6.12}$$
If the point of support is stationary ($\sigma = 0$), Equation (6.12)
yields

$$\alpha_f = 3(11 \pm \sqrt{41})/40 = 0.3448,\ 1.3052,$$

which are the values reported in Reference [49]. As the value of
$\sigma$ tends to infinity, Equation (6.12) in the limit leads to

$$\alpha_f = 1 \pm \sqrt{3}/3 = 0.4226,\ 1.5773.$$
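The flutter branches of Equation (6.9) and the limiting tangency coefficients of Equation (6.10) can be evaluated along the following lines. This is a sketch under stated assumptions, not the computation used for the published figures: the function names are hypothetical, and the limits are obtained by solving the quadratic (6.10) directly rather than by coding Equation (6.11).

```python
import math


def flutter_coefficients(alpha, gamma=0.0, sigma=0.0):
    """Coefficients a0, b0, c0 of Eq. (6.8)."""
    s2 = sigma**2
    a0 = 2 - 2 * alpha + alpha**2
    b0 = 8 - 4 * gamma + 6 * s2
    c0 = 41 - 44 * gamma + 12 * gamma**2 + s2 * (66 - 36 * gamma + 27 * s2)
    return a0, b0, c0


def flutter_loads(alpha, gamma=0.0, sigma=0.0):
    """Return the two roots Q_f of Eq. (6.9), or None outside the flutter range."""
    a0, b0, c0 = flutter_coefficients(alpha, gamma, sigma)
    radicand = (alpha - b0) ** 2 - a0 * c0
    if radicand < 0:
        return None
    r = math.sqrt(radicand)
    return (b0 - alpha - r) / (2 * a0), (b0 - alpha + r) / (2 * a0)


def flutter_alpha_limits(gamma=0.0, sigma=0.0):
    """Roots of Eq. (6.10): the values of alpha at which the radicand vanishes."""
    _, b0, c0 = flutter_coefficients(0.0, gamma, sigma)  # b0, c0 do not depend on alpha
    disc = (b0 - c0) ** 2 - (c0 - 1) * (2 * c0 - b0**2)
    if disc < 0 or c0 == 1:
        return None
    r = math.sqrt(disc)
    return (c0 - b0 - r) / (c0 - 1), (c0 - b0 + r) / (c0 - 1)
```

With the default arguments `flutter_alpha_limits()` gives approximately 0.3448 and 1.3052, and for a large `sigma` the limits approach $1 \pm \sqrt{3}/3$.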

The expressions for the values of the critical divergence
and flutter loads given in Equations (6.3) and (6.9) serve to
furnish the stability boundaries in the $\alpha Q$-plane that are plotted
in stability diagrams. As a first set of stability diagrams,
suppose that the gravitational force is absent ($\gamma = 0$). Then the
values of $Q_d$ and $Q_f$ become functions of only the tangency
coefficient $\alpha$ and the support motion parameter $\sigma$.

If the point of support remains stationary ($\sigma = 0$), then the
stability diagram becomes that reported in Reference [49] and
shown in Figure 2. The various regions of the diagram are
delineated by boundaries of the divergence, flutter, and
stability zones, as labeled in the figure. This plot reveals
that the smallest critical load for the system will be a
divergence load as long as $\alpha < \alpha_{m1} = 5/9$. Thus, even though the
loading is non-conservative when $\alpha < \alpha_{m1}$ and $\alpha \neq 0$, the system
becomes unstable by divergence. For $\alpha_{m1} < \alpha < \alpha_f = 3(11 + \sqrt{41})/40$,
the smallest positive critical load is a flutter load.
A tensile critical load can lead to divergence when $\alpha > 1$,
i.e., when the applied load is super-tangential. If $\alpha > 3(11 + \sqrt{41})/40$,
then either a compressive or a tensile load of sufficient
magnitude can lead to instability through divergence.

For $\sigma > 0$, it will be seen in Figures 3 to 5 that the
stability boundaries will be shifted away from those in the
reference stability diagram in Figure 2. In Figures 3 to 5, the
stability boundaries have been plotted for $\sigma = 1$, 2, and 3,
respectively, along with the boundaries for $\sigma = 0$ for purposes of
comparison. For compressive critical loads, the numerical values
are shifted higher along the $Q$-axis, whereas the boundary is
displaced downward for tensile super-tangential loads. The
effect of the displacement of the stability boundaries becomes
more pronounced as the value of the support motion parameter $\sigma$ is
increased. Clearly, for a given value of the tangency
coefficient $\alpha$, the sinusoidal motion of the point of support of
the double pendulum increases the value of the critical load, be
it a divergence or a flutter load.

To render evident the stabilizing effect of the method of
vibrational control when applied to this system, it is convenient
to consider a couple of special values of $\alpha$, namely $\alpha = 0$ and
$\alpha = 1$. When the load is conservative ($\alpha = 0$), the system becomes
unstable by divergence, and the value of its critical load can be
calculated from

depending upon the value of the support motion parameter $\sigma$.

A plot of the variation of $\alpha_f$ as a function of $\sigma$ as
determined from Equation (6.12) is shown in Figure 7. It reveals
that the values of $\alpha_{f1}$ and $\alpha_{f2}$ increase monotonically and very
slowly with increasing values of the support motion parameter.

Scrutiny of the flutter zone in the stability diagrams shown
in Figures 2 to 5 leads one to conclude that the value of the
critical flutter load $Q_f$ attains a minimum value in the $\alpha Q_f$-plane,
at which point a horizontal tangent exists. It is of interest to
compute $\min Q_f$ and the value of $\alpha$, say $\alpha_M$, at which this minimum
occurs. To accomplish this objective, it will be convenient to
use Equations (6.7) and (6.9). Differentiating Equation (6.7)
with respect to $\alpha$ and imposing the condition that $dQ_f/d\alpha = 0$, one
finds

$$\min Q_f = \frac{1}{2(1 - \alpha_M)}. \tag{6.13}$$

The branch of Equation (6.9) that is pertinent to the present
goal is the expression that contains the negative sign before the
radical. If the derivative $dQ_f/d\alpha$ of Equation (6.9) is formed,
then the condition $dQ_f/d\alpha = 0$ leads to

$$4(1 - \alpha_M)\min Q_f = 1 + [(1 - c_0)\alpha_M + c_0 - b_0]/R^{1/2}, \tag{6.14}$$

where

$$R = (1 - c_0)\alpha_M^2 + 2(c_0 - b_0)\alpha_M + b_0^2 - 2c_0. \tag{6.15}$$
Substitution of Equation (6.13) into (6.14) yields, after
rearrangement,

$$R^{1/2} = (1 - c_0)\alpha_M + c_0 - b_0.$$

Squaring both sides of this expression and rearranging the
result, one finds

$$(c_0 - 1)\alpha_M^2 + 2(b_0 - c_0)\alpha_M + c_0 - 2b_0 + 2 = 0,$$

whence

$$\alpha_M = \frac{33 - 40\gamma + 12\gamma^2 + 3\sigma^2(20 - 12\gamma + 9\sigma^2) \pm [(3 - 2\gamma)^2 + 3\sigma^2(6 - 4\gamma + 3\sigma^2)]^{1/2}}{4(10 - 11\gamma + 3\gamma^2) + 3\sigma^2(22 - 12\gamma + 9\sigma^2)}. \tag{6.16}$$


In the special case of the absence of the gravitational force
($\gamma = 0$), Equation (6.16) becomes simply

$$\alpha_M = \frac{3(1 + \sigma^2)}{4 + 3\sigma^2}, \qquad \frac{9(1 + \sigma^2)}{10 + 9\sigma^2}. \tag{6.17}$$


The first of the values for $\alpha_M$ in Equation (6.17) pertains to
the minimum. Therefore, with this value of $\alpha_M$, Equation (6.13)
becomes simply

$$\min Q_f = 2 + 3\sigma^2/2, \tag{6.18}$$

a remarkably simple result. Hence, if the support point remains
immobile ($\sigma = 0$), then $\min Q_f = 2$. Otherwise the value of $\min Q_f$
increases with the square of the base motion parameter $\sigma$.
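The closed-form minimum can be compared with a direct evaluation of the lower flutter branch. The sketch below, restricted to $\gamma = 0$, is an illustration with hypothetical function names; it evaluates the branch of Equation (6.9) with the negative sign at the first value of $\alpha_M$ in Equation (6.17) and returns the load predicted by Equation (6.13).

```python
import math


def lower_flutter_load(alpha, sigma=0.0):
    """Branch of Eq. (6.9) with the negative sign before the radical, gamma = 0.

    math.sqrt raises ValueError if alpha lies outside the flutter range.
    """
    s2 = sigma**2
    a0 = 2 - 2 * alpha + alpha**2
    b0 = 8 + 6 * s2
    c0 = 41 + s2 * (66 + 27 * s2)
    radicand = (alpha - b0) ** 2 - a0 * c0
    return (b0 - alpha - math.sqrt(radicand)) / (2 * a0)


def minimum_flutter_load(sigma=0.0):
    """Return (alpha_M, min Q_f) from Eqs. (6.17) and (6.13) for gamma = 0."""
    s2 = sigma**2
    alpha_M = 3 * (1 + s2) / (4 + 3 * s2)
    return alpha_M, 1 / (2 * (1 - alpha_M))
```

For `sigma = 1`, for example, `minimum_flutter_load(1.0)` gives $\alpha_M = 6/7$ and a minimum load of 3.5, which is $2 + 3\sigma^2/2$, and `lower_flutter_load` takes the same value at that $\alpha_M$ and larger values on either side.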


It is possible to derive from Equation (6.9) a relatively
simple asymptotic expression for $Q_f$ as $\sigma$ becomes very large. The
process is very straightforward, so the details are not repeated
here. The result can be shown to be

$$Q_f \sim \frac{3\sigma^2\,[2 - (6\alpha - 2 - 3\alpha^2)^{1/2}]}{2(2 - 2\alpha + \alpha^2)} \quad \text{as } \sigma \to \infty. \tag{6.19}$$

In particular, when $\alpha = 1$, i.e., the follower force is
tangential, $Q_f \sim 3\sigma^2/2$ as $\sigma$ tends to infinity.
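The quality of the asymptotic expression (6.19) can be gauged by comparing it with the lower branch of Equation (6.9) at a moderately large value of $\sigma$. The sketch below is illustrative; the function names are hypothetical, and gravity is omitted ($\gamma = 0$) on the assumption that its contribution is of lower order in $\sigma$.

```python
import math


def lower_flutter_load(alpha, sigma):
    """Branch of Eq. (6.9) with the negative sign before the radical, gamma = 0."""
    s2 = sigma**2
    a0 = 2 - 2 * alpha + alpha**2
    b0 = 8 + 6 * s2
    c0 = 41 + s2 * (66 + 27 * s2)
    return (b0 - alpha - math.sqrt((alpha - b0) ** 2 - a0 * c0)) / (2 * a0)


def asymptotic_flutter_load(alpha, sigma):
    """Large-sigma estimate of Q_f from Eq. (6.19).

    Real only for 1 - sqrt(3)/3 <= alpha <= 1 + sqrt(3)/3.
    """
    a0 = 2 - 2 * alpha + alpha**2
    return 3 * sigma**2 * (2 - math.sqrt(6 * alpha - 2 - 3 * alpha**2)) / (2 * a0)
```

At `sigma = 50` and `alpha = 1` the estimate is 3750, and the ratio of the exact lower-branch value to the estimate differs from unity by well under one percent.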


When the value of the dimensionless gravity parameter $\gamma$ is
positive and relatively small, the stability diagrams continue to
resemble those in Figures 3 to 5, but now with the divergence
boundaries displaced toward the $\alpha$-axis. A transition in the
forms of these boundaries occurs when the value of $\gamma$ reaches a
particular level. This happens when the coefficient $b_3$ in
Equation (6.1) vanishes. Thus, when

$$b_3 = 8(1 - 5\gamma + 3\gamma^2) + 9\sigma^2(2 - 4\gamma + \sigma^2) = 0, \tag{6.20}$$

Equation (6.1) assumes the form

$$Q_d\,[(1 - \alpha)Q_d - b_1 + \alpha b_2] = 0,$$

whence

$$Q_d = 0 \quad \text{and} \quad Q_d = 3(1 - \gamma + \sigma^2) - \gamma/(1 - \alpha). \tag{6.21}$$


Solving Equation (6.20) for the transition values of $\gamma$, say, $\gamma_T$,
one finds

$$\gamma_T = [10 + 9\sigma^2 \pm (52 + 72\sigma^2 + 27\sigma^4)^{1/2}]/12. \tag{6.22}$$


Therefore, from Equations (6.21) and (6.22), it follows that the
second expression for $Q_d$ becomes

$$Q_{dT} = 3(1 + \sigma^2) + \frac{(3\alpha - 4)\,[10 + 9\sigma^2 \pm (52 + 72\sigma^2 + 27\sigma^4)^{1/2}]}{12(1 - \alpha)} \tag{6.23}$$


for $\alpha \neq 1$. It should be noted from Equation (6.1) that when
$\alpha = 1$ then $Q_{dT} = 0$ is the only solution. Now there is a value of the
tangency coefficient $\alpha$, say $\alpha_0$, at which the value of $Q_{dT}$ will
vanish. From Equation (6.21), this is easily shown to be

$$\alpha_0 = [3(1 + \sigma^2) - 4\gamma_T]/3(1 + \sigma^2 - \gamma_T). \tag{6.24}$$


As a special case of the results shown in Equations (6.22)
to (6.24), one has for $\sigma = 0$ (the point of support is stationary)

$$\gamma_T = (5 \pm \sqrt{13})/6, \qquad Q_{dT} = 3 + \frac{(3\alpha - 4)(5 \pm \sqrt{13})}{6(1 - \alpha)}, \qquad \alpha_0 = (9 \pm \sqrt{13})/6. \tag{6.25}$$


In particular, for $\gamma_T = (5 - \sqrt{13})/6 = 0.2324$, it follows from
Equation (6.25) that $\alpha_0 = (9 - \sqrt{13})/6 = 0.8991$. In Figure 8, the
plotted boundaries of the divergence zones are determined from
$Q_{dT} = 0$ and

$$Q_{dT} = 3 + \frac{(3\alpha - 4)(5 - \sqrt{13})}{6(1 - \alpha)}. \tag{6.26}$$
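The transition values in Equations (6.20) to (6.26) lend themselves to a short numerical check. The following sketch uses hypothetical function names and is meant only to illustrate the relations; it is not taken from the original computations.

```python
import math


def b3(gamma, sigma=0.0):
    """Coefficient b3 of Eq. (6.2); it vanishes at the transition, Eq. (6.20)."""
    return 8 * (1 - 5 * gamma + 3 * gamma**2) + 9 * sigma**2 * (2 - 4 * gamma + sigma**2)


def transition_gamma(sigma=0.0):
    """Both transition values gamma_T of Eq. (6.22), smaller one first."""
    s2 = sigma**2
    r = math.sqrt(52 + 72 * s2 + 27 * s2**2)
    return (10 + 9 * s2 - r) / 12, (10 + 9 * s2 + r) / 12


def nonzero_divergence_root(alpha, gamma_t, sigma=0.0):
    """Second root of Eq. (6.21), valid when b3 = 0 and alpha != 1."""
    return 3 * (1 - gamma_t + sigma**2) - gamma_t / (1 - alpha)


def alpha_zero(gamma_t, sigma=0.0):
    """Tangency coefficient of Eq. (6.24) at which the second root vanishes."""
    s2 = sigma**2
    return (3 * (1 + s2) - 4 * gamma_t) / (3 * (1 + s2 - gamma_t))
```

For a stationary support, `transition_gamma()[0]` is approximately 0.2324 and `alpha_zero` of that value is approximately 0.8991, matching the numbers quoted in the text.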


The boundaries for the flutter zone are computed as before. It
may be observed from this figure that $Q_d = 0$ is a critical load
when $\gamma = (5 - \sqrt{13})/6$ and $\sigma = 0$. This implies that the double
pendulum will collapse under the weight of the concentrated
masses. A new feature appears in this plot, namely, that for
$\sigma = 0$ the left hand divergence boundaries are
characterized by a pair of intersecting curves rather than a pair
of branches that terminate in a point at which a vertical tangent
exists. However, as soon as the point of support begins to
oscillate at small amplitude and high frequency, the intersection
feature disappears and the more familiar shape of the boundary is
restored. To illustrate this, the divergence and flutter
boundaries have also been plotted in Figure 8 for the combination
of parameters $\sigma = 1$ and $\gamma = (5 - \sqrt{13})/6$. Again, it is evident
that the oscillation of the support point tends to improve the
stability properties of the system.

For $\gamma > \gamma_T$, the boundaries of the divergence zones in the
stability map change their character still more. For example,
the right hand branch will now intersect the vertical line $\alpha = 1$
and will then become asymptotic to it. Thus, for a purely
tangential force ($\alpha = 1$), Equation (6.1) leads to a single root
for $Q_d$, namely,

$$Q_d = -[8(1 - 5\gamma + 3\gamma^2) + 9\sigma^2(2 - 4\gamma + \sigma^2)]/8\gamma. \tag{6.27}$$

For example, when $\gamma = 1/4$ and $\sigma = 1/2$, Equation (6.27) leads to
$Q_d = -37/32 = -1.15625$.

Since the stability boundary curve intersects the line $\alpha = 1$
and eventually becomes asymptotic to it, it follows that this
curve must possess a vertical tangent for a value of $\alpha$ that is
probably slightly less than unity. An expression for this value
of $\alpha$ will now be determined. If Equation (6.1) is
differentiated with respect to $Q_d$ and the derivative $d\alpha/dQ_d$ is
equated to zero, the result is, after a little rearrangement,

$$Q_d = (b_1 - \alpha b_2)/2(1 - \alpha). \tag{6.28}$$

Substitution of Equation (6.28) into Equation (6.1) yields

$$2(b_1 - \alpha b_2)^2 = b_3(1 - \alpha),$$

which is a quadratic equation in $\alpha$. The solutions of this are
easily shown to be

$$\alpha = \{4b_1 b_2 - b_3 \pm [b_3(b_3 + 8b_2^2 - 8b_1 b_2)]^{1/2}\}/4b_2^2,$$

or, more explicitly,

$$\alpha = \frac{4(7 - 11\gamma + 6\gamma^2) + 3\sigma^2(18 - 16\gamma + 9\sigma^2) \pm [8(1 - 5\gamma + 3\gamma^2) + 9\sigma^2(2 - 4\gamma + \sigma^2)]^{1/2}\,[8(1 - 2\gamma) + 3\sigma^2(6 - 4\gamma + 3\sigma^2)]^{1/2}}{36(1 - \gamma + \sigma^2)^2}. \tag{6.29}$$


In the special case of $\gamma = 1/4$ and $\sigma = 1/2$, Equation (6.29)
yields

$$\alpha = (491 \pm \sqrt{4921})/576 = 0.7306,\ 0.9742.$$

The second of these values is indeed slightly less than unity, as
was foreseen above. The corresponding values of $Q_d$ can be
computed from Equation (6.28). The results are $Q_d = 1.0359$ and
$Q_d = -3.3484$, respectively.
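The numerical values quoted for $\gamma = 1/4$ and $\sigma = 1/2$ can be reproduced from Equations (6.27) to (6.29). The sketch below is a hedged illustration with hypothetical function names; it uses the compact form of the solution in terms of $b_1$, $b_2$, and $b_3$ rather than the explicit form (6.29), and it skips any root at $\alpha = 1$, where Equation (6.28) is singular.

```python
import math


def b_coefficients(gamma, sigma):
    """Coefficients b1, b2, b3 of Eq. (6.2)."""
    b1 = 3 - 4 * gamma + 3 * sigma**2
    b2 = 3 * (1 - gamma + sigma**2)
    b3 = 8 * (1 - 5 * gamma + 3 * gamma**2) + 9 * sigma**2 * (2 - 4 * gamma + sigma**2)
    return b1, b2, b3


def tangential_divergence_load(gamma, sigma):
    """Single root of Eq. (6.1) for alpha = 1, i.e. Eq. (6.27); requires gamma != 0."""
    _, _, b3 = b_coefficients(gamma, sigma)
    return -b3 / (8 * gamma)


def vertical_tangent_points(gamma, sigma):
    """Return [(alpha, Q_d), ...] where the divergence boundary has a vertical tangent."""
    b1, b2, b3 = b_coefficients(gamma, sigma)
    disc = b3 * (b3 + 8 * b2**2 - 8 * b1 * b2)
    if disc < 0:
        return []
    r = math.sqrt(disc)
    points = []
    for sign in (-1, 1):
        alpha = (4 * b1 * b2 - b3 + sign * r) / (4 * b2**2)
        if abs(1 - alpha) < 1e-12:
            continue  # Eq. (6.28) is singular at alpha = 1
        points.append((alpha, (b1 - alpha * b2) / (2 * (1 - alpha))))  # Eq. (6.28)
    return points
```

Calling `vertical_tangent_points(0.25, 0.5)` gives two points with $\alpha$ near 0.7306 and 0.9742 and loads near 1.0359 and -3.3484, while `tangential_divergence_load(0.25, 0.5)` gives -37/32.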


With the help of the information assembled in the two
preceding paragraphs, it is now possible to plot the stability
map for $\gamma > \gamma_T$. As a case in point, Figure 9 has been plotted
for $\gamma = 1/4$ and $\sigma = 0$ (the dashed boundaries) and $\sigma = 1/2$ (the
solid boundaries). It is to be observed that for $\sigma = 0$ the
boundaries of the left hand divergence zone no longer coalesce
for some value of $\alpha$ in the domain $0 < \alpha < 1$, as was the case in
Figures 2 to 4. Instead of curving upward as the value of $\alpha$ is
increased, the lower branch curves abruptly downward and tends
rapidly in an asymptotic sense to negative infinity as the value
of $\alpha$ tends to unity. The upper branch of the left hand
divergence zone continues to show a tendency to decrease with
increasing $\alpha$. It initially appears to approach the lower branch
but then veers away as $\alpha$ approaches unity. Thereafter, for
increasing $\alpha$, the value of $Q_d$ decreases monotonically and tends
asymptotically to $Q_d = 0$. The region between these two branches
is a zone of divergence. The region below the lower branch is a
zone of stability. The right hand branch of the divergence
boundary remains very similar to those already seen in Figures 2
to 4. The region above it is a zone of divergence. The flutter
boundary is also quite similar to the analogous boundaries shown
in Figures 2 to 4. The region bounded by the upper boundary of
the left hand divergence branch, the lower boundary of the
flutter zone, and the upper boundary of the right hand divergence
branch is a zone of stability.


When the point of support is made to oscillate harmonically
such that $\sigma$ assumes the value $\sigma = 1/2$, the boundary curves are
not only shifted in the same senses as they were in Figures 2 to
4, but the branches of the left hand divergence zone now once
again coalesce. The low amplitude, high frequency motion of the
point of support obviously overcomes the influence of gravity and
stabilizes the system. If the value of $\sigma$ were increased, the
degree of stabilization would be increased accordingly.
from the averaged system are a subset of those that can be
derived from the original system with periodic coefficients.
This observation then permits the drawing of conclusions
regarding the stabilization of the double pendulum under
consideration based upon the averaged system of equations of
motion.

It was shown in Sections 5 and 6 that the shape of the
stability boundaries depends upon the gravity parameter $\gamma$ and the
induced support motion parameter $\sigma$. It may be recalled from
Equation (5.5) that


$$\gamma = mg\ell/c \quad \text{and} \quad \sigma = y_0\,\Omega\,(m/c)^{1/2}.$$


Thus, $\sigma$ is essentially the product of the amplitude $y_0$ and the
frequency of the motion of the point of support. Stability
maps have been drawn for representative values of the parameters $\gamma$
and $\sigma$ to illustrate their effects upon the state of stability of
the system. All the calculations reported here reveal that the
values of the critical divergence and flutter loads for a given
value of the tangency coefficient $\alpha$ can be raised significantly
by increasing the value of the parameter $\sigma$.
Figure 5. Stability diagrams for $\sigma = 3$ and